On some applications of finite-state automata theory to natural language processing

  • Author: Mehryar Mohri
  • Affiliation: AT&T Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974, USA. E-mail: mohri@research.att.com
  • Venue: Natural Language Engineering
  • Year: 1996

Abstract

We describe new applications of automata theory to natural language processing: the representation of very large-scale dictionaries and the indexation of natural language texts. Both are based on new algorithms that we introduce and describe in detail. In particular, we give pseudocode for the determinisation of string-to-string transducers, the deterministic union of p-subsequential string-to-string transducers, and indexation by automata. We report on several experiments illustrating these applications.
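
To make the dictionary application concrete, the following is a minimal sketch, not the paper's algorithm: it stores a small lexicon in a plain trie, that is, a deterministic acyclic automaton without minimisation, where each final state carries one output string. The names `State`, `build_dictionary`, `lookup`, and `count_states`, as well as the sample entries, are assumptions made for illustration only.

```python
from typing import Dict, Optional


class State:
    """One state of a deterministic acyclic automaton (here, a trie node)."""

    def __init__(self) -> None:
        # Determinism: at most one outgoing transition per input symbol.
        self.next: Dict[str, "State"] = {}
        # Output string attached to the state; None means "not final".
        self.output: Optional[str] = None


def build_dictionary(entries: Dict[str, str]) -> State:
    """Build the automaton by inserting each word symbol by symbol."""
    start = State()
    for word, value in entries.items():
        state = start
        for symbol in word:
            # Reuse the existing transition if present, so prefixes are shared.
            state = state.next.setdefault(symbol, State())
        state.output = value
    return start


def lookup(start: State, word: str) -> Optional[str]:
    """Follow one transition per symbol; time is linear in len(word)."""
    state: Optional[State] = start
    for symbol in word:
        state = state.next.get(symbol)
        if state is None:
            return None  # no transition: the word is not in the dictionary
    return state.output


def count_states(start: State) -> int:
    """Count the states reachable from the start state (iterative traversal)."""
    stack, total = [start], 0
    while stack:
        state = stack.pop()
        total += 1
        stack.extend(state.next.values())
    return total


# Hypothetical lexicon mapping word forms to illustrative tags.
lexicon = build_dictionary({"cat": "N", "cats": "N+pl", "car": "N"})
```

Because the automaton is deterministic, `lookup(lexicon, "cats")` follows exactly one path, so its cost depends on the length of the word and not on the size of the dictionary. This sketch shares only prefixes; it does not minimise the automaton and does not distribute outputs along transitions as a subsequential transducer would.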