On designing an automated Malaysian stemmer for the Malay language (poster session)

  • Authors:
  • Sock Yin Tai;Cheng Soon Ong;Noor Aida Abdullah

  • Affiliations:
  • Software Lab, MIMOS Berhad, Technology Park Malaysia, 57000 Kuala Lumpur, Malaysia;Software Lab, MIMOS Berhad, Technology Park Malaysia, 57000 Kuala Lumpur, Malaysia;Software Lab, MIMOS Berhad, Technology Park Malaysia, 57000 Kuala Lumpur, Malaysia

  • Venue:
  • IRAL '00 Proceedings of the fifth international workshop on Information retrieval with Asian languages
  • Year:
  • 2000

Abstract

Online and interactive information retrieval systems are likely to play an increasing role in the Malay language community. To facilitate and automate the matching of morphological term variants, a stemmer based on common affix removal algorithms is proposed as part of the design of an information retrieval system for the Malay language. Stemming is the morphological process of normalizing word tokens to their essential roots. The proposed stemmer strips prefixes and suffixes from each word. An experiment conducted on web sites selected from the World Wide Web showed substantial improvements in the number of words indexed.
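
The affix removal described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the affix lists, the minimum stem length, and the names `PREFIXES`, `SUFFIXES`, `MIN_STEM_LEN`, `strip_affixes`, and `group_variants` are all assumptions chosen for illustration, and the rules actually used in the paper are not given in this record.

```python
from collections import defaultdict

# Hypothetical, deliberately small affix lists (NOT taken from the paper).
# Longer affixes come first so that "meng" is tried before "me".
PREFIXES = ("meng", "mem", "men", "ber", "ter", "me", "di", "ke", "se")
SUFFIXES = ("kan", "nya", "lah", "kah", "an", "i")

# Assumed guard against over-stemming: never leave a stem shorter than this.
MIN_STEM_LEN = 4


def strip_affixes(word: str) -> str:
    """Strip at most one prefix and one suffix from a lowercased word."""
    stem = word.lower()
    for prefix in PREFIXES:
        if stem.startswith(prefix) and len(stem) - len(prefix) >= MIN_STEM_LEN:
            stem = stem[len(prefix):]
            break
    for suffix in SUFFIXES:
        if stem.endswith(suffix) and len(stem) - len(suffix) >= MIN_STEM_LEN:
            stem = stem[:-len(suffix)]
            break
    return stem


def group_variants(tokens):
    """Map each stem to the set of surface forms that reduce to it."""
    groups = defaultdict(set)
    for token in tokens:
        groups[strip_affixes(token)].add(token.lower())
    return dict(groups)


if __name__ == "__main__":
    # The three variants of "baca" (read) fall under one index term.
    print(group_variants(["membaca", "dibaca", "bacaan", "berjalan"]))
```

In this sketch the variants "membaca", "dibaca", and "bacaan" all reduce to "baca", which is the kind of term-variant matching the abstract refers to. Plain stripping of this sort does not handle Malay spelling changes at the prefix boundary (for example, "menulis" is formed from the root "tulis"), so a practical stemmer would need additional rules or a root-word dictionary.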