Stemming Arabic conjunctions and prepositions

Authors:
Abdusalam F. A. Nwesri;S. M. M. Tahaghoghi;Falk Scholer
Affiliations:
School of Computer Science and Information Technology, RMIT University, Melbourne, Australia;School of Computer Science and Information Technology, RMIT University, Melbourne, Australia;School of Computer Science and Information Technology, RMIT University, Melbourne, Australia
Venue:
SPIRE'05 Proceedings of the 12th international conference on String Processing and Information Retrieval
Year:
2005

Abstract

Arabic is the fourth most widely spoken language in the world, and is characterised by a high rate of inflection. To cater for this, most Arabic information retrieval systems incorporate a stemming stage. Most existing Arabic stemmers are derived from English equivalents; however, unlike English, most affixes in Arabic are difficult to discriminate from the core word. Removing incorrectly identified affixes sometimes results in a valid but incorrect stem, and in most cases reduces retrieval precision. Conjunctions and prepositions form an interesting class of these affixes. In this work, we present novel approaches for dealing with these affixes. Unlike previous approaches, our approaches focus on retaining valid Arabic core words, while maintaining high retrieval performance.
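
The abstract does not describe the algorithms themselves, so the following is only a minimal sketch of the general idea of removing a conjunction or preposition prefix while retaining a valid core word. It is not the authors' method. The function name `strip_particle`, the prefix list, and the lexicon lookup are all assumptions made for illustration: a leading particle is removed only when the word is not itself in the lexicon and the remainder is.

```python
# Hypothetical illustration only; not the algorithm from the paper.
# Single-letter Arabic particles that attach to the front of a word:
# wa (and), fa (so), bi (with), li (for), ka (like).
PREFIXES = ("و", "ف", "ب", "ل", "ك")


def strip_particle(word, lexicon):
    """Remove one leading particle only if a valid core word remains.

    `lexicon` is an assumed set of known Arabic words.
    """
    # If the whole word is already a known word, leave it alone.
    # This avoids "valid but incorrect" stems, e.g. turning
    # كتاب (book) into تاب (he repented) by stripping ك.
    if word in lexicon:
        return word
    # Strip the particle only when the remainder is a known word.
    if word and word[0] in PREFIXES and word[1:] in lexicon:
        return word[1:]
    # Otherwise keep the word unchanged rather than guess.
    return word


if __name__ == "__main__":
    lexicon = {"كتاب", "تاب", "ولد"}
    for w in ("وكتاب", "كتاب", "ولد"):
        print(w, "->", strip_particle(w, lexicon))
```

In this sketch, وكتاب ("and a book") loses its conjunction because كتاب is in the lexicon, while كتاب and ولد ("boy") are left intact even though each begins with a letter that can act as a particle. A real system would need a much larger lexicon and a policy for words it does not cover.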