This paper presents a methodology for the automatic acquisition of lexical and morpho-syntactic information from raw corpora. The system relies on inflectional morphology declared as rules and on the co-occurrence in the corpus of different forms of the same paradigm. Applied directly, this methodology yields very poor precision because rules from different paradigms interact. We present a rule analysis algorithm that addresses this problem and substantially improves precision, although recall decreases dramatically. Finally, we investigate several techniques for raising recall, achieving recall of around 67% at a precision of 92%.
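
To make the co-occurrence idea concrete, the following is a minimal sketch of the direct (naive) application only, not the rule analysis algorithm, whose details the abstract does not give. The toy English-like paradigms, the function name `candidate_entries`, and the `min_forms` threshold are all illustrative assumptions rather than anything taken from the paper.

```python
from collections import defaultdict

# Hypothetical toy paradigms (an assumption for illustration):
# paradigm name -> endings that attach to a shared stem.
PARADIGMS = {
    "noun": ["", "s"],
    "verb": ["", "s", "ed", "ing"],
}


def candidate_entries(tokens, paradigms=PARADIGMS, min_forms=2):
    """Propose (stem, paradigm) pairs supported by co-occurring forms.

    A pair is kept when at least `min_forms` of the forms generated by the
    paradigm for that stem are attested in the corpus vocabulary.
    """
    vocab = set(tokens)
    attested = defaultdict(set)  # (stem, paradigm) -> attested forms
    for word in vocab:
        for name, endings in paradigms.items():
            for ending in endings:
                if not word.endswith(ending):
                    continue
                # Strip the ending to obtain a candidate stem.
                stem = word[: len(word) - len(ending)]
                if not stem:
                    continue
                # Generate the full paradigm and keep the forms seen in the corpus.
                forms = {stem + e for e in endings}
                attested[(stem, name)] = forms & vocab
    return {key: forms for key, forms in attested.items() if len(forms) >= min_forms}


corpus = "walk walks walked walking cat cats".split()
for (stem, paradigm), forms in sorted(candidate_entries(corpus).items()):
    print(stem, paradigm, sorted(forms))
```

On this toy corpus both stems are proposed under both paradigms, because the noun endings are a subset of the verb endings. That overlap is one simple way in which rules from different paradigms can interact and lower precision when the method is applied directly.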