This study examines the impact of decompounding and of two word normalization methods, stemming and lemmatization, on monolingual and bilingual retrieval. The monolingual runs cover English, Finnish, German and Swedish. In the bilingual runs the source language is English and the target languages are Finnish, German and Swedish. In the monolingual runs, retrieval in a lemmatized compound index gives almost as good results as retrieval in a decompounded index. In the bilingual runs the two differ: retrieval in a lemmatized decompounded index performs better than retrieval in a lemmatized compound index. Indexes without decompounding perform worse in bilingual retrieval because of a difference between the source and target languages: English uses phrases where Finnish, German and Swedish use compounds. No notable performance differences were found between stemming and lemmatization.
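The phrase-versus-compound mismatch can be illustrated with a small sketch. It is not the method used in the study: the lexicon, the `decompound` helper, and the word-by-word translation of "world championship" are all hypothetical, and the splitter ignores linking elements such as the German "s" between compound parts. It only shows why translated phrase components fail to match an index that stores compounds whole.

```python
def decompound(word, lexicon, min_len=3):
    """Split `word` into lexicon entries; return [word] if no full split exists."""
    word = word.lower()
    if word in lexicon:
        return [word]
    # Try the longest candidate head first, then recurse on the remainder.
    for cut in range(len(word) - min_len, min_len - 1, -1):
        head, tail = word[:cut], word[cut:]
        if head in lexicon:
            rest = decompound(tail, lexicon, min_len)
            if all(part in lexicon for part in rest):
                return [head] + rest
    return [word]


# Toy lexicon of lemmas (assumption, for illustration only).
lexicon = {"welt", "meisterschaft", "stadt", "haus"}

# One document token as a lemmatizer might emit it: the compound kept whole.
doc_tokens = ["weltmeisterschaft"]

# Hypothetical dictionary translation of the English phrase
# "world championship", one target word per source word.
query_terms = {"welt", "meisterschaft"}

compound_index = set(doc_tokens)
decompounded_index = {
    part for token in doc_tokens for part in decompound(token, lexicon)
}

matches_compound = compound_index & query_terms
matches_decompounded = decompounded_index & query_terms
```

In this toy setup `matches_compound` is empty, while `matches_decompounded` contains both query terms. A monolingual query would typically contain the compound itself, which is consistent with the smaller gap reported for the monolingual runs.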