Most evaluations of part-of-speech tagging compare the output of an automatic tagger to some established standard, define the differences as tagging errors, and try to remedy them by, e.g., more training of the tagger. The present article is based on a manual analysis of a large number of tagging errors. Some clear patterns among the errors can be discerned, and the sources of the errors, as well as possible alternative methods of remedy, are presented and discussed. In particular, the problems posed by undecidable cases are treated.