Tagging English by path voting constraints

Authors:
Gökhan Tür;Kemal Oflazer
Affiliations:
Bilkent University, Bilkent, Ankara, Turkey;Bilkent University, Bilkent, Ankara, Turkey
Venue:
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2
Year:
1998

Citing 11
Cited 2

Grammatical category disambiguation by statistical optimization

Computational Linguistics
Some advances in transformation-based part of speech tagging

AAAI '94 Proceedings of the twelfth national conference on Artificial intelligence (vol. 1)
Transformation-based error-driven learning and natural language processing: a case study in part-of-speech tagging

Computational Linguistics
Constraint Grammar: A Language-Independent System for Parsing Unrestricted Text

Constraint Grammar: A Language-Independent System for Parsing Unrestricted Text
Coping with ambiguity and unknown words through probabilistic models

Computational Linguistics - Special issue on using large corpora: II
A stochastic parts program and noun phrase parser for unrestricted text

ANLC '88 Proceedings of the second conference on Applied natural language processing
A practical part-of-speech tagger

ANLC '92 Proceedings of the third conference on Applied natural language processing
A simple rule-based part of speech tagger

ANLC '92 Proceedings of the third conference on Applied natural language processing
Ambiguity resolution in a reductionistic parser

EACL '93 Proceedings of the sixth conference on European chapter of the Association for Computational Linguistics
A syntax-based part-of-speech analyser

EACL '95 Proceedings of the seventh conference on European chapter of the Association for Computational Linguistics
Morphological disambiguation by voting constraints

ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics

A second-order Hidden Markov Model for part-of-speech tagging

ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Implementing voting constraints with finite state transducers

FSMNLP '09 Proceedings of the International Workshop on Finite State Methods in Natural Language Processing

Quantified Score

Hi-index	0.00

Visualization

Abstract

We describe a constraint-based tagging approach where individual constraint rules vote on sequences of matching tokens and tags. Disambiguation of all tokens in a sentence is performed at the very end by selecting tags that appear on the path that receives the highest vote. This constraint application paradigm makes the outcome of the disambiguation independent of the rule sequence, and hence relieves the rule developer from worrying about potentially conflicting rule sequencing. The approach can also combine statistically and manually obtained constraints, and incorporate negative constraint rules to rule out certain patterns. We have applied this approach to tagging English text from the Wall Street Journal and the Brown Corpora. Our results from the Wall Street Journal Corpus indicate that with 400 statistically derived constraint rules and about 800 hand-crafted constraint rules, we can attain an average accuracy of 97.89% on the training corpus and an average accuracy of 97.50% on the testing corpus. We can also relax the single tag per token limitation and allow ambiguous tagging which lets us trade recall and precision.