Learning part-of-speech guessing rules from lexicon: extension to non-concatenative operations

Authors:
Andrei Mikheev
Affiliations:
University of Edinburgh, Edinburgh, Scotland, UK
Venue:
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2
Year:
1996

Citing 4

Transformation-based error-driven learning and natural language processing: a case study in part-of-speech tagging. Computational Linguistics.
Coping with ambiguity and unknown words through probabilistic models. Computational Linguistics - Special issue on using large corpora: II.
Unsupervised learning of word-category guessing rules. ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics.
Part-of-speech tagging with neural networks. COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1.

Cited 5

Email classification for contact centers. Proceedings of the 2003 ACM symposium on Applied computing.
Automatic rule induction for unknown-word guessing. Computational Linguistics.
POS disambiguation and unknown word guessing with decision trees. EACL '99 Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics.
Automatic thesaurus generation through multiple filtering. COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 1.
Automatic lexical acquisition from raw corpora: an application to Russian. MorphSlav '03 Proceedings of the 2003 EACL Workshop on Morphological Processing of Slavic Languages.

Abstract

One of the problems in part-of-speech tagging of real-world texts is that of words unknown to the lexicon. In (Mikheev, 1996), a technique for fully unsupervised statistical acquisition of rules which guess possible parts-of-speech for unknown words was proposed. One of the over-simplifications assumed by this learning technique was the acquisition of morphological rules which obey only simple concatenative regularities of the main word with an affix. In this paper we extend this technique to the non-concatenative cases of suffixation and assess the gain in performance.
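
The contrast between concatenative and non-concatenative suffixation can be illustrated with a minimal sketch. It is not the paper's implementation: the `SuffixRule` structure, the `apply_rule` helper, the toy lexicon, and the Penn Treebank style tags are all assumptions made for illustration. A rule strips a suffix from the unknown word, optionally restores mutated letters to recover the main word, and guesses a tag class only if the recovered word is in the lexicon with the expected tags.

```python
from typing import Dict, FrozenSet, NamedTuple, Optional


class SuffixRule(NamedTuple):
    """Hypothetical rule format, assumed for illustration only."""
    suffix: str                  # ending stripped from the unknown word
    mutation: str                # letters restored to the stem ("" = purely concatenative)
    stem_tags: FrozenSet[str]    # tag class the recovered main word must have
    guess_tags: FrozenSet[str]   # tag class guessed for the unknown word


def apply_rule(word: str, rule: SuffixRule,
               lexicon: Dict[str, FrozenSet[str]]) -> Optional[FrozenSet[str]]:
    """Return the guessed tag class, or None if the rule does not apply."""
    if not word.endswith(rule.suffix) or len(word) <= len(rule.suffix):
        return None
    # Strip the suffix, then undo the assumed stem-final mutation.
    stem = word[:len(word) - len(rule.suffix)] + rule.mutation
    if lexicon.get(stem) == rule.stem_tags:
        return rule.guess_tags
    return None


# Toy lexicon and tags (assumptions, not data from the paper).
VERB = frozenset({"VB", "VBP"})
PAST = frozenset({"VBD", "VBN"})
lexicon = {"walk": VERB, "try": VERB}

concatenative = SuffixRule(suffix="ed", mutation="", stem_tags=VERB, guess_tags=PAST)
mutating = SuffixRule(suffix="ied", mutation="y", stem_tags=VERB, guess_tags=PAST)

walked = apply_rule("walked", concatenative, lexicon)   # stem "walk" is in the lexicon
tried_plain = apply_rule("tried", concatenative, lexicon)  # stem "tri" is not
tried_mutated = apply_rule("tried", mutating, lexicon)  # stem "tr" + "y" = "try"
```

In this sketch the purely concatenative rule handles "walked" but fails on "tried", because stripping "ed" leaves "tri", which is not in the lexicon. The rule with a mutation recovers "try" and so covers the case the concatenative rule misses.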