Stemming can improve retrieval accuracy, but stemmers are language-specific. Character n-gram tokenization achieves many of the benefits of stemming in a language-independent way, but its use incurs a performance penalty. We demonstrate that selecting a single n-gram as a pseudo-stem for each word can be an effective and efficient language-neutral approach for some languages.
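
The idea of a single n-gram pseudo-stem can be illustrated with a minimal sketch. The abstract does not say how the n-gram is chosen, so the selection rule below (take the word's n-gram with the lowest document frequency, breaking ties by leftmost position) is an assumption made for illustration only, as are the function names `char_ngrams`, `ngram_document_frequencies`, and `pseudo_stem` and the default n-gram length of 4.

```python
from collections import Counter


def char_ngrams(word, n=4):
    """Return the overlapping character n-grams of a word.

    Words no longer than n are returned whole, as a single "n-gram".
    """
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def ngram_document_frequencies(documents, n=4):
    """Count, for each n-gram, the number of documents that contain it."""
    df = Counter()
    for doc in documents:
        seen = set()
        for word in doc.lower().split():
            seen.update(char_ngrams(word, n))
        df.update(seen)  # each n-gram counted once per document
    return df


def pseudo_stem(word, df, n=4):
    """Pick one n-gram to stand in for the word.

    ASSUMPTION: the rarest n-gram (lowest document frequency) is chosen;
    the source abstract does not specify the selection criterion.
    min() keeps the first minimum, so ties go to the leftmost n-gram.
    """
    grams = char_ngrams(word.lower(), n)
    return min(grams, key=lambda g: df.get(g, 0))
```

With hand-set frequencies in which "walk" is rarer than the other n-grams of "walking" and "walked", both words map to the same pseudo-stem "walk", so each word is indexed under one term rather than under every n-gram it contains.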