Language software applications encounter new words, e.g., acronyms, technical terminology, names, or compounds of such words. To add a new word to a lexicon, we need to indicate its inflectional paradigm. We present a new, generally applicable method for creating an entry generator, i.e., a paradigm guesser, for finite-state transducer lexicons. Because a guesser tends to produce numerous suggestions, it is important that the correct suggestions be among the first few candidates. We prove some formal properties of the method and evaluate it on full-scale Finnish, English, and Swedish transducer lexicons. We use the open-source Helsinki Finite-State Technology [1] to create finite-state transducer lexicons from existing lexical resources and to automatically derive guessers for unknown words. The method has a recall of 82-87% and a precision of 71-76% for the three test languages. The model needs no external corpus and can therefore serve as a baseline.
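To make the idea of a ranked paradigm guesser concrete, the following is a minimal sketch in plain Python. It is not the finite-state method described above and does not use Helsinki Finite-State Technology. It assumes a simple heuristic: an unknown word is likely to inflect like known words that share its longest word-final suffix. The toy lexicon, the paradigm labels (`V_REG`, `N_S`, `N_Y_IES`), and the function names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_suffix_index(lexicon, max_suffix_len=5):
    """Map each word-final suffix to the paradigms seen with it, with counts.

    `lexicon` is an iterable of (word, paradigm) pairs.
    """
    index = defaultdict(lambda: defaultdict(int))
    for word, paradigm in lexicon:
        for n in range(1, min(len(word), max_suffix_len) + 1):
            index[word[-n:]][paradigm] += 1
    return index


def guess_paradigms(word, index, max_suffix_len=5):
    """Return candidate paradigms for an unknown word, best candidate first.

    A paradigm is scored by the longest suffix it shares with the word;
    ties on suffix length are broken by how often that suffix occurred
    with the paradigm in the lexicon.
    """
    best = {}
    for n in range(1, min(len(word), max_suffix_len) + 1):
        for paradigm, count in index.get(word[-n:], {}).items():
            # Suffixes are visited shortest first, so a longer match overwrites.
            best[paradigm] = (n, count)
    return sorted(best, key=lambda p: best[p], reverse=True)


# Hypothetical toy lexicon: (word, paradigm label) pairs.
TOY_LEXICON = [
    ("walk", "V_REG"),
    ("talk", "V_REG"),
    ("chalk", "N_S"),
    ("city", "N_Y_IES"),
    ("party", "N_Y_IES"),
    ("boy", "N_S"),
    ("dog", "N_S"),
]

index = build_suffix_index(TOY_LEXICON)
candidates = guess_paradigms("ability", index)
```

Here `candidates` lists the paradigm sharing the suffix `ity` before the one that only shares `y`, which mirrors the requirement that the correct suggestion appear among the first few candidates. A word whose final letter never occurs in the lexicon yields an empty list.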