Evaluation of perstem: a simple and efficient stemming algorithm for Persian

Authors:
Amir Hossein Jadidinejad; Fariborz Mahmoudi; Jon Dehdari
Affiliations:
Electrical and Computer Engineering Department, Islamic Azad University; Electrical and Computer Engineering Department, Islamic Azad University; Department of Linguistics, The Ohio State University
Venue:
CLEF'09 Proceedings of the 10th cross-language evaluation forum conference on Multilingual information access evaluation: text retrieval experiments
Year:
2009

Abstract

Persian is a challenging language for natural language processing (NLP). Its right-to-left orthography, complex morphology, complicated grammatical rules, and multiple letter forms make it an interesting subject for NLP research. In this paper we measure the effectiveness of Perstem, a simple and efficient stemming algorithm, on Persian information retrieval. Our experiments on the Hamshahri corpus at CLEF 2009 show that Perstem greatly improved both precision (+91%) and recall (+43%).
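
To make the reported percentages concrete, the sketch below shows one common way to compute set-based precision and recall for a single query and the relative change between an unstemmed and a stemmed run. It is an illustration only: the abstract does not say which exact measures were used, the function names `precision_recall` and `relative_change` are hypothetical, and the document IDs are invented, not taken from the Hamshahri corpus or the paper's results.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, using set overlap."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def relative_change(baseline, improved):
    """Relative change in percent; for example 0.2 -> 0.3 is +50%."""
    return (improved - baseline) / baseline * 100.0


# Hypothetical relevance judgments and runs (NOT data from the paper).
relevant = {"d1", "d2", "d3", "d4"}
unstemmed_run = ["d1", "d7", "d8", "d9"]  # 1 relevant document out of 4 retrieved
stemmed_run = ["d1", "d2", "d7", "d8"]    # 2 relevant documents out of 4 retrieved

p0, r0 = precision_recall(unstemmed_run, relevant)
p1, r1 = precision_recall(stemmed_run, relevant)

print(f"precision: {p0:.2f} -> {p1:.2f} ({relative_change(p0, p1):+.0f}%)")
print(f"recall:    {r0:.2f} -> {r1:.2f} ({relative_change(r0, r1):+.0f}%)")
```

Read this way, a figure such as "+91%" is a change relative to the unstemmed baseline, not an absolute gain of 91 percentage points.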