Word category disambiguation for Malayalam: a language model approach

Authors:
T. Dinesh;V. Jayan;V. K. Bhadran
Affiliations:
Thiruvananthapuram, Kerala, India;Thiruvananthapuram, Kerala, India;Thiruvananthapuram, Kerala, India
Venue:
Proceedings of the Second International Conference on Computational Science, Engineering and Information Technology
Year:
2012

Abstract

This paper introduces a new method of word category disambiguation for the Malayalam language. The proposed model is a supervised machine learning system. It consists of a language model trained on an annotated corpus of 10,000 words. The model checks the trigram probability of a tag's occurrence in the training corpus. To improve the tagging results, a morphological analyzer and a named entity recognizer are used in addition to the language model. The accuracy of the system can be improved by increasing the size of the annotated corpus. Although the experiments were performed on a small corpus, the results show that the statistical approach works well for a highly agglutinative language like Malayalam.
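
The trigram idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tag names, the placeholder tokens (`w1`, `w2`, ...), the function names, and the unsmoothed maximum-likelihood estimate are all assumptions made for illustration. The sketch counts tag trigrams in a small annotated corpus and picks, from a set of candidate tags (which in a real system might come from a morphological analyzer), the tag that is most probable given the two preceding tags.

```python
from collections import defaultdict

START = "<s>"  # hypothetical padding tag for sentence starts


def train_trigram_tag_model(tagged_sentences):
    """Count tag trigrams and their two-tag contexts in an annotated corpus."""
    trigram_counts = defaultdict(int)
    context_counts = defaultdict(int)
    for sentence in tagged_sentences:
        tags = [START, START] + [tag for _, tag in sentence]
        for i in range(2, len(tags)):
            context_counts[(tags[i - 2], tags[i - 1])] += 1
            trigram_counts[(tags[i - 2], tags[i - 1], tags[i])] += 1
    return trigram_counts, context_counts


def trigram_probability(model, t1, t2, t3):
    """Unsmoothed estimate of P(t3 | t1, t2); 0.0 for an unseen context."""
    trigram_counts, context_counts = model
    context = context_counts.get((t1, t2), 0)
    if context == 0:
        return 0.0
    return trigram_counts.get((t1, t2, t3), 0) / context


def pick_tag(model, prev2, prev1, candidates):
    """Choose the candidate tag most probable after the two previous tags."""
    return max(candidates, key=lambda t: trigram_probability(model, prev2, prev1, t))


# Toy annotated corpus with placeholder tokens (assumed data, not from the paper).
corpus = [
    [("w1", "NN"), ("w2", "NN"), ("w3", "VB")],
    [("w4", "NN"), ("w5", "JJ"), ("w6", "NN")],
    [("w7", "NN"), ("w8", "NN"), ("w9", "VB")],
]
model = train_trigram_tag_model(corpus)

# Disambiguate a word whose candidate tags are NN, JJ, or VB,
# given that it follows a sentence-initial noun.
best = pick_tag(model, START, "NN", ["NN", "JJ", "VB"])
```

A practical tagger would add smoothing or backoff for unseen trigrams, which matters with a corpus as small as 10,000 words, and would search over whole tag sequences rather than choosing each tag greedily.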