Word category disambiguation for Malayalam: a language model approach

  • Authors:
  • T. Dinesh;V. Jayan;V. K. Bhadran

  • Affiliations:
  • Thiruvananthapuram, Kerala, India;Thiruvananthapuram, Kerala, India;Thiruvananthapuram, Kerala, India

  • Venue:
  • Proceedings of the Second International Conference on Computational Science, Engineering and Information Technology
  • Year:
  • 2012

Abstract

In this paper we introduce a new method of word category disambiguation for the Malayalam language. The proposed model is a supervised machine learning system. It consists of a language model trained on an annotated corpus of 10,000 words. The model checks the trigram probability of a tag's occurrence in the training corpus. To improve tagging results, we use a Morphological Analyzer and a Named Entity Recognizer in addition to the language model. The accuracy of the system can be improved by increasing the size of the annotated corpus. Although the experiments were performed on a small corpus, the results show that the statistical approach works well with a highly agglutinative language like Malayalam.
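To make the trigram idea concrete, the following is a minimal sketch of a tag-trigram language model used to choose among candidate tags. It is not the authors' implementation. The tag names, the toy corpus with placeholder words, the `START` padding tag, the add-one smoothing, and the greedy left-to-right decoding are all assumptions made for illustration. The candidate tag lists stand in for what a morphological analyzer might propose for each word.

```python
from collections import defaultdict

START = "<s>"  # hypothetical padding tag for sentence starts


def train_trigram_tags(tagged_sentences):
    """Count tag trigrams and their two-tag contexts in an annotated corpus."""
    trigram_counts = defaultdict(int)
    context_counts = defaultdict(int)
    tagset = set()
    for sentence in tagged_sentences:
        tags = [START, START] + [tag for _, tag in sentence]
        tagset.update(tags[2:])
        for i in range(2, len(tags)):
            trigram_counts[(tags[i - 2], tags[i - 1], tags[i])] += 1
            context_counts[(tags[i - 2], tags[i - 1])] += 1
    return trigram_counts, context_counts, tagset


def trigram_prob(model, t1, t2, t3):
    """Estimate P(t3 | t1, t2) with add-one smoothing (an assumed choice)."""
    trigram_counts, context_counts, tagset = model
    numerator = trigram_counts.get((t1, t2, t3), 0) + 1
    denominator = context_counts.get((t1, t2), 0) + len(tagset)
    return numerator / denominator


def disambiguate(model, candidate_tags):
    """Greedily pick, for each word, the candidate tag with the highest
    trigram probability given the two previously chosen tags."""
    prev2, prev1 = START, START
    chosen = []
    for candidates in candidate_tags:
        best = max(candidates, key=lambda t: trigram_prob(model, prev2, prev1, t))
        chosen.append(best)
        prev2, prev1 = prev1, best
    return chosen


# Toy annotated corpus: placeholder words, invented tags.
corpus = [
    [("w1", "NOUN"), ("w2", "NOUN"), ("w3", "VERB")],
    [("w4", "ADJ"), ("w5", "NOUN"), ("w6", "VERB")],
    [("w7", "NOUN"), ("w8", "VERB")],
]
model = train_trigram_tags(corpus)

# Each inner list holds the candidate tags for one word of a new sentence.
result = disambiguate(model, [["ADJ"], ["NOUN", "VERB"], ["NOUN", "VERB"]])
print(result)  # ['ADJ', 'NOUN', 'VERB']
```

Greedy decoding keeps the sketch short. A Viterbi search over the whole sentence would be the more common choice for a trigram tagger, and the abstract does not say which decoding strategy the paper uses.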