Document retrieval in languages with a rich and complex morphology, particularly with respect to derivation and single-word composition, suffers serious performance degradation when query terms are matched against text words by stemming alone. We propose an alternative approach in which morphologically complex word forms are segmented into relevant subwords (such as stems, named entities, and acronyms), and these subwords constitute the basic unit for indexing and retrieval. We evaluate our approach on a large biomedical document collection.
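To make the idea of subword-based indexing concrete, the following is a minimal sketch in Python and not the system evaluated here. It assumes a tiny hand-made subword lexicon (`SUBWORDS`) and a greedy longest-match segmenter; both are hypothetical stand-ins chosen for illustration, as are the function names `segment`, `build_index`, and `search`. Each token is split into subwords, the subwords are stored in an inverted index, and a query matches a document when all of its subwords occur there.

```python
from collections import defaultdict

# Hypothetical toy lexicon of subwords (an assumption for illustration only).
SUBWORDS = {"gastr", "enter", "itis", "hepat", "o", "megal", "y", "nephr"}


def segment(word, lexicon=SUBWORDS):
    """Split a word into subwords by greedy longest match, left to right.

    If the lexicon cannot cover the word, the whole lowercased word is
    returned as a single unit, so unknown tokens remain searchable.
    """
    word = word.lower()
    parts, i = [], 0
    while i < len(word):
        # Try the longest candidate starting at position i first.
        for j in range(len(word), i, -1):
            if word[i:j] in lexicon:
                parts.append(word[i:j])
                i = j
                break
        else:
            return [word]  # no subword fits at position i
    return parts


def build_index(docs):
    """Build an inverted index mapping each subword to a set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            for subword in segment(token):
                index[subword].add(doc_id)
    return index


def search(query, index):
    """Return ids of documents that contain every subword of the query."""
    postings = [
        index.get(subword, set())
        for term in query.split()
        for subword in segment(term)
    ]
    return set.intersection(*postings) if postings else set()


# Two toy "documents" (invented for this example).
docs = {1: "gastroenteritis", 2: "hepatomegaly nephritis"}
index = build_index(docs)
result = search("enteritis", index)  # matches inside the compound in doc 1
```

With stemming-only matching, the query "enteritis" would not match the text word "gastroenteritis"; here both share the subwords "enter" and "itis". Greedy matching is the simplest possible choice and can fail where a backtracking segmenter would succeed, and matching on all subwords regardless of which word they came from can produce false hits, so a realistic system would need a more careful segmentation and matching strategy.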