Part-of-Speech Tagging Using Decision Trees
ECML '98 Proceedings of the 10th European Conference on Machine Learning
Head-driven statistical models for natural language parsing
Head-driven statistical models for natural language parsing
Supertagging: an approach to almost parsing
Computational Linguistics
TnT: a statistical part-of-speech tagger
ANLC '00 Proceedings of the sixth conference on Applied natural language processing
Tagging inflective languages: prediction of morphological categories for a rich, structured tagset
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Principle-based parsing without overgeneration
ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
Inducing multilingual text analysis tools via robust projection across aligned corpora
HLT '01 Proceedings of the first international conference on Human language technology research
Transformation-based learning in the fast lane
NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
Minimally supervised induction of grammatical gender
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Improved statistical alignment models
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Using 'smart' bilingual projection to feature-tag a monolingual dictionary
CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
Hi-index | 0.00 |
This paper presents an original approach to part-of-speech tagging of fine-grained features (such as case, aspect, and adjective person/number) in languages such as English where these properties are generally not morphologically marked. The goals of such rich lexical tagging in English are to provide additional features for word alignment models in bilingual corpora (for statistical machine translation), and to provide an information source for part-of-speech tagger induction in new languages via tag projection across bilingual corpora. First, we present a classifier-combination approach to tagging English bitext with very fine-grained part-of-speech tags necessary for annotating morphologically richer languages such as Czech and French, combining the extracted features of three major English parsers, and achieve fine-grained-tag-level syntactic analysis accuracy higher than any individual parser. Second, we present experimental results for the cross-language projection of part-of-speech taggers in Czech and French via word-aligned bitext, achieving successful fine-grained part-of-speech tagging of these languages without any Czech or French training data of any kind.