Stemming is a key step in most text mining and information retrieval applications. Information extraction, semantic annotation, and ontology learning are only a few of the tasks that depend on a stemmer. Although light stemmers have proven highly effective for Arabic information retrieval, they fall short of the accuracy that many text mining applications require. There are two reasons: light stemmers apply their rules indiscriminately, and they do not handle broken plurals at all, even though this class of plurals is very common in Arabic text. The goal of this work is to overcome both limitations. The evaluation shows that the proposed approach significantly improves stemming accuracy, and that this improvement in turn significantly improves tasks such as automatic annotation and keyphrase extraction.
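The two limitations named above can be illustrated with a minimal sketch of a generic light stemmer. This is not the stemmer proposed in this work. The affix lists, the minimum stem length, and the function name `light_stem` are assumptions chosen for illustration, loosely modelled on the kind of rules light stemmers commonly use.

```python
# Illustrative affix lists (assumed for this sketch, not taken from this work).
# Longer affixes come first so that the longest match wins.
PREFIXES = ("وال", "بال", "كال", "فال", "ال", "لل", "و")
SUFFIXES = ("ها", "ان", "ات", "ون", "ين", "ية", "ه", "ة", "ي")
MIN_STEM_LEN = 3  # hypothetical threshold: never shrink a word below this


def light_stem(word: str) -> str:
    """Strip at most one prefix and one suffix, with no lexical checks."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LEN:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            word = word[:-len(suffix)]
            break
    return word


# Regular affixes are handled as intended.
print(light_stem("الكتاب"))   # "the book": the definite article is removed
print(light_stem("معلمون"))   # "teachers" (sound plural): reduced to "معلم"

# Limitation 1: rules fire indiscriminately. In "بركان" (volcano) the final
# "ان" belongs to the word itself, yet it is stripped like a suffix.
print(light_stem("بركان"))

# Limitation 2: broken plurals are untouched. "كتب" (books) is not mapped
# to its singular "كتاب" (book), so the two forms are never conflated.
print(light_stem("كتب") == light_stem("كتاب"))
```

Because the sketch consults no dictionary or context, it cannot tell when a matching character sequence is part of the word rather than an affix, and it has no mechanism at all for plurals formed by changing the internal pattern of a word.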