A Word Analysis System for German Hyphenation, Full Text Search, and Spell Checking, with Regard to the Latest Reform of German Orthography

  • Authors:
  • Gabriele Kodydek

  • Venue:
  • TSD '00 Proceedings of the Third International Workshop on Text, Speech and Dialogue
  • Year:
  • 2000

Abstract

In text processing systems, German words require special treatment because compound words can be formed by combining existing words. To this end, a universal word analysis system is introduced that analyzes all words in German texts in terms of their atomic components. A recursive decomposition algorithm, following the rules for inflection, derivation, and compound formation in German, splits words into their smallest relevant parts (atoms), which are stored in an atom table. The system is based on the foundations described in this article and is used for reliable, sense-conveying hyphenation and sense-conveying full text search and, in limited form, as a spelling checker.
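
The recursive decomposition idea can be illustrated with a minimal Python sketch. It is not the system described in the paper: the atom table `ATOMS` is a hypothetical toy set, the function name `decompose` is invented for illustration, and the grammatical rules for inflection, derivation, and compounding that the real algorithm follows are left out entirely. The sketch only shows how a word can be split recursively into pieces that all occur in an atom table.

```python
from functools import lru_cache

# Hypothetical toy atom table (assumption): the paper's actual table and
# its atom categories are not given in the abstract.
ATOMS = frozenset({"haus", "tür", "schlüssel", "s", "arbeit", "zimmer"})


def decompose(word, atoms=ATOMS):
    """Return every way to split `word` into a sequence of atoms.

    Simplification: any atom may follow any other atom. A real analyzer
    would also check which atom types are allowed to combine.
    """
    word = word.lower()

    @lru_cache(maxsize=None)
    def split_from(start):
        # Reaching the end of the word means one complete decomposition.
        if start == len(word):
            return ((),)
        results = []
        # Try every prefix of the remaining text that is a known atom,
        # then recurse on what is left.
        for end in range(start + 1, len(word) + 1):
            piece = word[start:end]
            if piece in atoms:
                for rest in split_from(end):
                    results.append((piece,) + rest)
        return tuple(results)

    return [list(parts) for parts in split_from(0)]


if __name__ == "__main__":
    for w in ("Haustürschlüssel", "Arbeitszimmer", "xyz"):
        print(w, decompose(w))
```

With this toy table, `decompose("Arbeitszimmer")` yields the single split `arbeit` + `s` + `zimmer`, where `s` plays the role of a linking element, and a word with no valid split yields an empty list. Joining the atoms of a split with hyphens gives candidate break points at component boundaries, which is the rough intuition behind hyphenation that respects word meaning; the sketch does not handle syllable breaks inside an atom.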