Transducer Minimization and Information Compression for NooJ Dictionaries

  • Authors:
  • Slim Mesfar; Max Silberztein

  • Affiliations:
  • LASELDI, Université de Franche-Comté, Besançon, France; LASELDI, Université de Franche-Comté, Besançon, France

  • Venue:
  • Proceedings of the 2009 conference on Finite-State Methods and Natural Language Processing: Post-proceedings of the 7th International Workshop FSMNLP 2008
  • Year:
  • 2009

Abstract

In this paper, we describe an incremental method for constructing minimal, acyclic, deterministic finite-state transducers (FSTs). The approach builds the transducer in a single step, adding new strings one by one and minimizing the resulting automaton incrementally. We then present a new method for encoding the morphological information associated with the dictionary entries. The new encoding unifies the analyses of a large number of word forms, thereby reducing the number of terminal states in the dictionary's FST, which in turn makes the minimization process more efficient. Finally, we present experimental results on the FST that represents the Arabic dictionary.
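
The incremental idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which variant they use, so the code below assumes the well-known sorted-input scheme for incremental minimization, and it builds a plain acceptor rather than a transducer (output information would have to be folded into the strings). All names (`State`, `build_minimal_automaton`, `accepts`, `count_states`) are hypothetical.

```python
import itertools


class State:
    """One automaton state; `uid` gives a stable identity for signatures."""
    _ids = itertools.count()

    def __init__(self):
        self.uid = next(State._ids)
        self.edges = {}       # character -> target State
        self.final = False

    def signature(self):
        # States with the same finality and the same labelled edges to the
        # same (already minimized) targets are equivalent and can be merged.
        targets = tuple(sorted((c, t.uid) for c, t in self.edges.items()))
        return (self.final, targets)


def build_minimal_automaton(words):
    """Add words one by one (in sorted order), minimizing as we go."""
    root = State()
    register = {}     # signature -> canonical state
    unchecked = []    # (parent, char, child) along the path of the last word
    previous = ""

    def minimize(down_to):
        # Merge or register states of the last path, deepest first.
        while len(unchecked) > down_to:
            parent, char, child = unchecked.pop()
            sig = child.signature()
            if sig in register:
                parent.edges[char] = register[sig]   # reuse equivalent state
            else:
                register[sig] = child

    for word in sorted(set(words)):
        common = 0
        limit = min(len(word), len(previous))
        while common < limit and word[common] == previous[common]:
            common += 1
        minimize(common)      # the suffix of the previous word is now frozen
        node = unchecked[-1][2] if unchecked else root
        for char in word[common:]:
            child = State()
            node.edges[char] = child
            unchecked.append((node, char, child))
            node = child
        node.final = True
        previous = word

    minimize(0)
    return root


def accepts(root, word):
    node = root
    for char in word:
        node = node.edges.get(char)
        if node is None:
            return False
    return node.final


def count_states(root):
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node.uid not in seen:
            seen.add(node.uid)
            stack.extend(node.edges.values())
    return len(seen)


automaton = build_minimal_automaton(["tap", "taps", "top", "tops"])
print(count_states(automaton))   # a trie for the same words needs 8 states
```

In this toy example the two branches `ta...` and `to...` end in identical suffixes, so their states are shared. The same effect is what makes a more uniform encoding of the morphological information attractive: the more entries end in identical information, the more suffix states can be merged.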