The Malay language may be written in either Roman (Rumi) or Jawi script. Most Malay stemmers handle only Rumi affixes. This paper proposes a stemmer for Jawi script that uses two sets of Jawi rules: one set stems the various forms of derived words, and the other replaces a dictionary lookup by producing the root word for each derivative. The stemmer was tested on 1185 derived words formed with prefixes, circumfixes, suffixes, and infixes. The results show that 84.89% of the Jawi root words were stemmed successfully.
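
To make the two-rule-set idea concrete, the following is a minimal sketch in Python and not the paper's actual rule tables. It assumes a tiny, hypothetical set of Jawi affixes (ber-, di-, me-, -kan), a hypothetical minimum stem length, and a single hypothetical restoration rule that turns the nasalised initial of a me- form back into the root's first letter. Infix handling is omitted. All names (`STRIP_RULES`, `RESTORE_RULES`, `stem`, and so on) are illustrative.

```python
from typing import NamedTuple, Optional, Tuple

# Illustrative Jawi affixes (assumed spellings, NOT the paper's rule tables).
BER = "بر"   # ber-
DI = "د"     # di-
ME = "م"     # me- (nasal prefix; see RESTORE_RULES)
KAN = "کن"   # -kan


class AffixRule(NamedTuple):
    kind: str     # "prefix", "suffix", or "circumfix"
    prefix: str   # empty string when the rule has no prefix part
    suffix: str   # empty string when the rule has no suffix part


# Rule set 1 (hypothetical): affix-stripping rules, tried in order.
# The circumfix comes first so that both ends are removed together.
STRIP_RULES = [
    AffixRule("circumfix", DI, KAN),
    AffixRule("prefix", BER, ""),
    AffixRule("prefix", DI, ""),
    AffixRule("prefix", ME, ""),
    AffixRule("suffix", "", KAN),
]

# Rule set 2 (hypothetical): root-restoration rules used instead of a
# dictionary. After "me-" is stripped, a leading nun is assumed to stand
# for a dropped ta (as in menulis -> tulis).
RESTORE_RULES = {ME: [("ن", "ت")]}

MIN_STEM_LEN = 3  # assumed guard against stripping short words to nothing


def strip_affixes(word: str) -> Tuple[str, Optional[AffixRule]]:
    """Apply the first stripping rule that leaves a long-enough stem."""
    for rule in STRIP_RULES:
        if not (word.startswith(rule.prefix) and word.endswith(rule.suffix)):
            continue
        end = len(word) - len(rule.suffix)
        candidate = word[len(rule.prefix):end]
        if len(candidate) >= MIN_STEM_LEN:
            return candidate, rule
    return word, None


def restore_root(candidate: str, rule: Optional[AffixRule]) -> str:
    """Rewrite the stem's first letter when the removed prefix altered it."""
    if rule is None:
        return candidate
    for old, new in RESTORE_RULES.get(rule.prefix, []):
        if candidate.startswith(old):
            return new + candidate[len(old):]
    return candidate


def stem(word: str) -> str:
    """Strip affixes (rule set 1), then restore the root (rule set 2)."""
    candidate, rule = strip_affixes(word)
    return restore_root(candidate, rule)
```

Under these assumptions, `stem(BER + "جالن")` and `stem(DI + "جالن" + KAN)` both return `"جالن"` (jalan), and `stem("منوليس")` (menulis) returns `"توليس"` (tulis) through the restoration rule. A realistic rule set would be much larger and would need to resolve conflicts between rules, which this sketch sidesteps by applying only the first matching rule.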