Machine learning techniques based on transformation rules have proven to be a viable alternative to stochastic tagging, achieving similar accuracy while offering advantages such as simplicity and better portability to other languages. However, data sparsity remains one of the greatest obstacles to tagging languages with complex morphology. The research on POS tagging for the Serbian language described in this paper has produced several original ideas for improving tagging accuracy and overcoming data sparsity in highly inflected languages. The resulting tagger achieves an error rate of 10.0% when trained on a previously annotated corpus of 190,000 words, which is comparable with results reported for some other languages with a similar level of inflection.