This paper introduces a new method of word category disambiguation (part-of-speech tagging) for the Malayalam language. The proposed model is a supervised machine learning system. It consists of a language model trained on an annotated corpus of 10,000 words, which estimates the trigram probability of tag occurrences in the training corpus. To improve the tagging results, a morphological analyzer and a named entity recognizer are used in addition to the language model. The accuracy of the system can be improved by increasing the size of the annotated corpus. Although the experiments were performed on a small corpus, the results show that the statistical approach works well for a highly agglutinative language like Malayalam.
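To make the trigram idea concrete, the following is a minimal sketch of how trigram tag probabilities might be estimated from an annotated corpus. It is not the authors' implementation: the function names, the `<s>` and `</s>` boundary markers, the placeholder tokens (`w1`, `w2`, ...), the tag names, and the use of unsmoothed maximum-likelihood estimates are all assumptions made for illustration. The abstract does not describe smoothing or how the morphological analyzer and named entity recognizer are combined with the model, so neither is shown.

```python
from collections import Counter

# Hypothetical boundary markers (an assumption, not from the paper).
START, END = "<s>", "</s>"


def train_trigram_tag_model(tagged_sentences):
    """Count tag trigrams and their two-tag contexts in an annotated corpus.

    tagged_sentences: list of sentences, each a list of (word, tag) pairs.
    Returns (trigram_counts, context_counts).
    """
    trigram_counts = Counter()
    context_counts = Counter()
    for sentence in tagged_sentences:
        # Pad with two start markers so the first real tag has a full context.
        tags = [START, START] + [tag for _, tag in sentence] + [END]
        for i in range(2, len(tags)):
            context = (tags[i - 2], tags[i - 1])
            context_counts[context] += 1
            trigram_counts[context + (tags[i],)] += 1
    return trigram_counts, context_counts


def trigram_probability(model, t1, t2, t3):
    """Maximum-likelihood estimate of P(t3 | t1, t2); 0.0 for unseen contexts."""
    trigram_counts, context_counts = model
    seen = context_counts[(t1, t2)]
    if seen == 0:
        return 0.0
    return trigram_counts[(t1, t2, t3)] / seen


# Toy corpus with placeholder tokens and illustrative tags (not real data).
toy_corpus = [
    [("w1", "NOUN"), ("w2", "NOUN"), ("w3", "VERB")],
    [("w4", "NOUN"), ("w5", "ADJ"), ("w6", "NOUN"), ("w7", "VERB")],
]

model = train_trigram_tag_model(toy_corpus)
p_noun_after_noun = trigram_probability(model, START, "NOUN", "NOUN")
p_adj_after_noun = trigram_probability(model, START, "NOUN", "ADJ")
```

With a corpus as small as 10,000 words, many tag trigrams will never be observed, so an unsmoothed estimate like this one assigns them zero probability. This is one way to see why a larger annotated corpus would be expected to improve accuracy.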