Morphology in Machine Translation Systems: Efficient Integration of Finite State Transducers and Feature Structure Descriptions

Authors: Jan W. Amtrup
Affiliations: Kofax Image Products, San Diego, USA 92121
Venue: Machine Translation
Year: 2003

Abstract

We present a finite state morphology system augmented with typed feature structures as weights on transitions. This mechanism allows the use of highly efficient finite state approaches for morphological analysis and generation, while providing the rich linguistic descriptions often used in Machine Translation systems. Using a semiring interpretation, the weight of a morphological analysis result represents the possible linguistic interpretations of an input word, while the resulting character string itself represents the lemma of the input. Long-distance phenomena and infixation can be handled in an easy and elegant manner, simultaneously providing a seamless interface to subsequent linguistic processing modules. Two extensions to the basic model are discussed: the incorporation of lexical knowledge into the finite state transducer and a transformation that renders unification-based finite state models as efficient as those employing other weight structures. The model is applied to morphological operations in a Persian-English Machine Translation system.
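
The idea of feature structures acting as transition weights can be illustrated with a small sketch. It is not the system described in the paper: the abstract does not spell out the semiring operations, so the code below assumes that the product of two weights is pairwise unification and that the sum collects alternative readings. It also uses plain untyped dictionaries instead of typed feature structures, and the English toy lexicon, the `ARCS` table, and all function names are hypothetical.

```python
from itertools import product

def unify(a, b):
    """Unify two feature structures given as nested dicts; None signals a clash."""
    result = dict(a)
    for key, value in b.items():
        if key not in result:
            result[key] = value
        elif isinstance(result[key], dict) and isinstance(value, dict):
            sub = unify(result[key], value)
            if sub is None:
                return None
            result[key] = sub
        elif result[key] != value:
            return None
    return result

# A weight is a list of alternative feature structures (a disjunction).
ONE = [{}]   # assumed identity of the product: unifies with everything
ZERO = []    # assumed identity of the sum: no reading at all

def times(ws1, ws2):
    """Assumed semiring product: keep every pairwise unification that succeeds."""
    out = []
    for a, b in product(ws1, ws2):
        u = unify(a, b)
        if u is not None and u not in out:
            out.append(u)
    return out

def plus(ws1, ws2):
    """Assumed semiring sum: merge the alternative readings."""
    out = list(ws1)
    for w in ws2:
        if w not in out:
            out.append(w)
    return out

# Hypothetical toy transducer: state -> [(input, output, weight, target)].
ARCS = {
    0: [("w", "w", ONE, 1)],
    1: [("a", "a", ONE, 2)],
    2: [("l", "l", ONE, 3)],
    3: [("k", "k", [{"cat": "verb"}, {"cat": "noun"}], 4)],
    4: [("s", "", [{"cat": "verb", "agr": {"person": 3, "number": "sg"}},
                   {"cat": "noun", "number": "pl"}], 5)],
}
FINAL = {4, 5}

def analyze(word):
    """Map each lemma produced for `word` to its accumulated weight."""
    results = {}
    agenda = [(0, 0, "", ONE)]  # (state, input position, lemma so far, weight)
    while agenda:
        state, pos, lemma, weight = agenda.pop()
        if pos == len(word):
            if state in FINAL:
                results[lemma] = plus(results.get(lemma, ZERO), weight)
            continue
        for symbol, output, arc_weight, target in ARCS.get(state, []):
            if symbol == word[pos]:
                combined = times(weight, arc_weight)
                if combined:  # prune paths whose weight became ZERO
                    agenda.append((target, pos + 1, lemma + output, combined))
    return results

print(analyze("walks"))
```

In this sketch, `analyze("walks")` yields the lemma `walk` as the output string, while the weight holds two readings, a third-person singular verb and a plural noun. The verb/noun cross-combinations are discarded because their `cat` values fail to unify.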