Implementing voting constraints with finite state transducers

Authors:
Kemal Oflazer;Gökhan Tür
Affiliations:
Bilkent University, Bilkent, Ankara, Turkey;Bilkent University, Bilkent, Ankara, Turkey
Venue:
FSMNLP '09 Proceedings of the International Workshop on Finite State Methods in Natural Language Processing
Year:
1998

Abstract

We describe a constraint-based morphological disambiguation system in which individual constraint rules vote on matching morphological parses, and then its implementation using finite state transducers. Voting constraint rules have several desirable properties. The outcome of disambiguation is independent of the order in which the local contextual constraint rules are applied, so the rule developer need not worry about conflicting rule sequencing. The approach can also combine statistically and manually obtained constraints, and can incorporate negative constraints that rule out certain patterns. The transducer implementation likewise has advantages over other finite state tagging and light parsing approaches that are implemented with automata intersection. The most important is that, since constraints do not remove parses, there is no risk of an overzealous constraint "killing a sentence" by removing all parses of a token during intersection. After describing our approach, we present preliminary results from tagging the Wall Street Journal Corpus. With about 400 statistically derived constraints and about 570 manual constraints, we attain an accuracy of 97.82% on the training corpus and 97.29% on the test corpus. We then describe a finite state implementation of our approach and discuss various related issues.
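
The voting idea in the abstract can be illustrated with a small sketch. It is not the authors' system and uses no finite state transducers. It assumes, purely for illustration, that a constraint is a sequence of tags with an integer vote, that a negative vote stands in for a negative constraint, and that the highest-voted candidate is chosen per token. The names `tally_votes` and `disambiguate`, the tags, and the weights are all hypothetical.

```python
# Illustrative sketch only: plain Python, not the transducer implementation.
# Assumption: each rule is (tag_pattern, vote); a negative vote plays the
# role of a negative constraint. Candidates are never deleted.

def tally_votes(sentence, rules):
    """sentence: list of tokens, each a list of candidate tags.
    rules: list of (pattern, vote), where pattern is a tuple of tags
    matched against consecutive tokens."""
    votes = [{tag: 0 for tag in candidates} for candidates in sentence]
    for pattern, vote in rules:
        width = len(pattern)
        for start in range(len(sentence) - width + 1):
            window = sentence[start:start + width]
            # The rule matches if every tag in the pattern is among the
            # candidates of the corresponding token.
            if all(tag in cands for tag, cands in zip(pattern, window)):
                for offset, tag in enumerate(pattern):
                    votes[start + offset][tag] += vote
    return votes

def disambiguate(sentence, rules):
    """Pick the highest-voted candidate per token (first one wins ties)."""
    votes = tally_votes(sentence, rules)
    return [max(cands, key=lambda t: v[t]) for cands, v in zip(sentence, votes)]

# Hypothetical ambiguous input: the second and third tokens have two readings.
sentence = [["DT"], ["NN", "VB"], ["VBZ", "NNS"]]
rules = [
    (("DT", "NN"), 2),    # hypothetical positive constraint
    (("NN", "VBZ"), 1),   # hypothetical positive constraint
    (("DT", "VB"), -3),   # hypothetical negative constraint
]

result = disambiguate(sentence, rules)
reordered = disambiguate(sentence, list(reversed(rules)))
```

Because votes are only summed, applying the rules in any order yields the same tallies, and because no candidate is ever removed, every token keeps at least one reading however strongly a rule votes against it.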