This work evaluates several search strategies for Arabic monolingual and cross-lingual retrieval, using the TREC Arabic corpus as the test bed. The release by NIST in 2001 of an Arabic corpus of nearly 400k documents, with both monolingual and cross-lingual queries and relevance judgments, has made such empirical studies possible. Experimental results show that spelling normalization and stemming can significantly improve Arabic monolingual retrieval. Character tri-grams from stems improved retrieval modestly on the test corpus, but the improvement was not statistically significant. To further improve retrieval, we propose a novel thesaurus-based technique. Unlike existing approaches to thesaurus-based retrieval, ours formulates word synonyms as probabilistic term translations that can be automatically derived from a parallel corpus. Retrieval results show that the thesaurus can significantly improve Arabic monolingual retrieval. For cross-lingual retrieval (CLIR), we found that spelling normalization and stemming have little impact.
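As a rough illustration of two ideas mentioned above, the following minimal sketch shows character tri-grams taken from a stem and a synonym distribution obtained by pivoting through translation probabilities. It rests on assumptions that the abstract does not state: the pivot formula P(a2 | a) = sum over e of P(e | a) * P(a2 | e), the function names `char_trigrams` and `synonym_probs`, and the toy probability tables (with transliterated words) are all hypothetical and chosen only for illustration. In practice the translation tables would be estimated from a parallel corpus.

```python
from collections import defaultdict


def char_trigrams(stem):
    """Return the overlapping character tri-grams of a stem.

    Stems shorter than three characters are returned unchanged
    (an assumption; the abstract does not say how they are handled).
    """
    if len(stem) < 3:
        return [stem]
    return [stem[i:i + 3] for i in range(len(stem) - 2)]


def synonym_probs(word, p_e_given_a, p_a_given_e):
    """Hypothetical pivot formulation of synonym probabilities.

    P(a2 | word) = sum over pivot terms e of P(e | word) * P(a2 | e),
    where both tables are nested dicts of translation probabilities.
    """
    scores = defaultdict(float)
    for e, p_e in p_e_given_a.get(word, {}).items():
        for a2, p_a2 in p_a_given_e.get(e, {}).items():
            scores[a2] += p_e * p_a2
    return dict(scores)


# Toy translation tables (made-up values, transliterated for readability).
p_e_given_a = {"kitab": {"book": 0.8, "letter": 0.2}}
p_a_given_e = {
    "book": {"kitab": 0.5, "sifr": 0.5},
    "letter": {"risala": 0.75, "kitab": 0.25},
}

trigrams = char_trigrams("kitab")
probs = synonym_probs("kitab", p_e_given_a, p_a_given_e)
print(trigrams)
print(sorted(probs.items(), key=lambda kv: -kv[1]))
```

In a retrieval setting, a distribution like `probs` could be used to weight expansion terms for a query word, so that a document containing a likely synonym still receives partial credit.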