When a corpus undergoes morphological analysis, some words always remain that the analyser cannot recognize (foreign names, misspellings, ...). A human reader, however, usually understands the texts, even if he does not know as many words as there are in the lexicon used by the morphological analyser. The language itself helps him to recognize unknown words: not only semantics and syntax but also the pure morphology of unknown words can contribute to their understanding. In this article, I describe a "guesser" that can reduce the number of words left unrecognized after the "classical" morphological analysis of Czech texts. It was tested on the Czech National Corpus.