We describe a constraint-based morphological disambiguation system in which individual constraint rules vote on the morphological parses they match, and then describe its implementation using finite-state transducers. Voting constraint rules have a number of desirable properties. The outcome of disambiguation is independent of the order in which the local contextual constraint rules are applied, so the rule developer need not worry about conflicting rule sequencing. The approach can also combine statistically and manually obtained constraints, and can incorporate negative constraints that rule out certain patterns. The transducer implementation likewise has several advantages over other finite-state tagging and light parsing approaches that are implemented with automata intersection. The most important is that, since constraints do not remove parses, there is no risk of an overzealous constraint "killing a sentence" by removing all parses of a token during intersection. After describing our approach, we present preliminary results from tagging the Wall Street Journal Corpus. With about 400 statistically derived constraints and about 570 manual constraints, we attain an accuracy of 97.82% on the training corpus and 97.29% on the test corpus. We then describe a finite-state implementation of our approach and discuss various related issues.
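The voting idea can be illustrated with a minimal sketch. This is not the authors' system or their transducer implementation: the function name `best_path`, the toy tag patterns, and the vote weights are all hypothetical, and the sketch enumerates tag paths by brute force rather than using finite-state machinery. It only shows why summed votes are independent of rule order and why a path always survives.

```python
from itertools import product

def best_path(candidates, constraints):
    """Pick the tag path with the highest total vote.

    candidates:  one list of possible tags per token (hypothetical toy input).
    constraints: (pattern, vote) pairs; pattern is a tuple of adjacent tags,
                 vote is positive to favor it and negative to disfavor it.
    """
    def score(path):
        total = 0
        for pattern, vote in constraints:
            n = len(pattern)
            # Every position where the pattern matches contributes its vote.
            for i in range(len(path) - n + 1):
                if path[i:i + n] == pattern:
                    total += vote
        return total

    # No path is ever deleted; paths are only ranked by their summed votes.
    return max(product(*candidates), key=score)

# Toy example (assumed tags and weights): "the can rusts"
candidates = [["DT"], ["MD", "NN"], ["VBZ", "NNS"]]
constraints = [
    (("DT", "NN"), 10),   # determiner followed by noun: favored
    (("DT", "MD"), -10),  # determiner followed by modal: negative constraint
    (("NN", "VBZ"), 5),   # noun followed by 3sg verb: mildly favored
]

print(best_path(candidates, constraints))
print(best_path(candidates, list(reversed(constraints))))
```

Because votes are simply summed, reordering the constraint list leaves the selected path unchanged, and even an input on which only negative constraints match still yields a path, since constraints lower scores instead of removing parses.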