The inflectional structure of a word affects the retrieval accuracy of information retrieval systems for Latin-based languages. We present two stemming algorithms for Arabic information retrieval systems. We first empirically investigate the effectiveness of surface-based retrieval, that is, matching on unstemmed word forms. This approach degrades retrieval precision because Arabic is a highly inflected language. Accordingly, we propose root-based retrieval and observe a statistically significant improvement over the surface-based approach. However, many variant word senses are based on an identical root, so the root-based algorithm creates invalid conflation classes; the resulting query is ambiguous, and the extraneous terms it adds degrade performance. To resolve this ambiguity, we propose a novel light-stemming algorithm for Arabic texts. This automatic rule-based stemming algorithm is not as aggressive as the root extraction algorithm. We show that the light-stemming algorithm significantly outperforms the root-based algorithm, and that light inflectional analysis of Arabic words yields a significant improvement in retrieval precision.
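To make the contrast between aggressive root extraction and light stemming concrete, the following is a minimal sketch of a rule-based light stemmer in Python. It is not the algorithm evaluated in this work: the affix lists, the minimum stem length, and the names `light_stem` and `conflate` are illustrative assumptions chosen only to show the general idea of stripping a small set of frequent prefixes and suffixes without reducing a word to its root.

```python
# Illustrative affix lists. These are assumptions for this sketch,
# not the rule set used by the algorithm described in the abstract.
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "لل", "و"]
SUFFIXES = ["ها", "ان", "ات", "ون", "ين", "يه", "ية", "ه", "ة", "ي"]

# Hypothetical guard: never strip an affix if fewer than this many
# letters would remain.
MIN_STEM_LEN = 3


def light_stem(word: str) -> str:
    """Strip at most one prefix and one suffix, trying longer affixes first."""
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LEN:
            word = word[len(prefix):]
            break
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            word = word[:-len(suffix)]
            break
    return word


def conflate(words):
    """Group words into conflation classes keyed by their light stem."""
    classes = {}
    for w in words:
        classes.setdefault(light_stem(w), []).append(w)
    return classes


if __name__ == "__main__":
    sample = ["الكتاب", "والكتاب", "كتابها", "مكتبة"]
    for stem, members in conflate(sample).items():
        print(stem, members)
```

Under these assumed rules, the three inflected forms of "book" (الكتاب, والكتاب, كتابها) fall into one class, while مكتبة ("library") stays in a class of its own. A root extractor would typically map all four to the shared root كتب, which is the kind of over-broad conflation class the abstract attributes to root-based retrieval.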