Automatic Language-Specific Stemming in Information Retrieval
CLEF '00 Revised Papers from the Workshop of Cross-Language Evaluation Forum on Cross-Language Information Retrieval and Evaluation
Bootstrapping the Albanian Information Retrieval
BCI '09 Proceedings of the 2009 Fourth Balkan Conference in Informatics
Poor man’s stemming: unsupervised recognition of same-stem words
AIRS'06 Proceedings of the Third Asia conference on Information Retrieval Technology
This work defines a methodology for building simple but robust stemmers without requiring knowledge of the stemmer's target language. The target stemmer is based on conditional suffix replacement (in practice, suffix removal) applied in one or more steps. The building process, which refines the stemmer, checks the results of a primary stemmer against the judgments of experts. The experts need not even be speakers of the target language: they are given the original words, their translations into the experts' native language, and the stems produced by the primary stemmer. The only language resources required are a list of suffixes used in the target language and translations of the terms occurring in a corpus of target-language texts.
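As a rough illustration of conditional suffix removal in one or more steps, the following Python sketch strips the longest matching suffix at each step, provided the remaining stem keeps a minimum length. The function names, the minimum-length condition, and the toy suffix lists are assumptions made for this example; the abstract does not specify the actual conditions or suffixes used.

```python
from typing import Iterable, Sequence


def strip_suffix(word: str, suffixes: Iterable[str], min_stem_len: int = 3) -> str:
    """Remove the longest matching suffix, but only if the stem stays long enough."""
    # Try longer suffixes first so that "es" wins over "s" when both match.
    for suffix in sorted(suffixes, key=len, reverse=True):
        stem_len = len(word) - len(suffix)
        # The "condition" here is a hypothetical minimum stem length.
        if word.endswith(suffix) and stem_len >= min_stem_len:
            return word[:stem_len]
    return word


def stem(word: str, steps: Sequence[Iterable[str]], min_stem_len: int = 3) -> str:
    """Apply one conditional suffix-removal pass per step, in order."""
    for suffixes in steps:
        word = strip_suffix(word, suffixes, min_stem_len)
    return word


# Hypothetical two-step suffix list (placeholder values, not taken from the paper).
STEPS = [["s", "es"], ["ing", "ed"]]

if __name__ == "__main__":
    for w in ["walked", "boxes", "red"]:
        print(w, "->", stem(w, STEPS))
```

In a refinement loop of the kind the abstract describes, the suffix lists and the condition would be adjusted whenever experts flag stems that wrongly merge or split words, judging from the translations rather than from knowledge of the language.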