CICLing'12 Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
SVM based Manipuri POS tagging using SVM based identified reduplicated MWE (RMWE)
Proceedings of the CUBE International Information Technology Conference
Word category disambiguation for malayalam: a language model approach
Proceedings of the Second International Conference on Computational Science, Engineering and Information Technology
This paper presents a part-of-speech (POS) tagger for the Malayalam language built using a Support Vector Machine (SVM). POS tagging plays an important role in natural language applications such as speech recognition, natural language parsing, information retrieval, and information extraction. This supervised machine learning approach requires a large annotated training corpus to tag accurately. In the initial stage of POS tagging for Malayalam, the model was trained with a very limited annotated corpus. We tried to maximize performance with a substantial amount of annotated corpus. The objective of this project was to identify the ambiguities in Malayalam lexical items and to develop an efficient and accurate POS tagger. We developed our own tagset for training and testing the POS tagger generators. The present tagset consists of 29 tags. A corpus of 180,000 words was used for training and testing the accuracy of the tagger generators. We found that the results were more efficient and accurate than those of earlier methods for Malayalam POS tagging.
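As a rough illustration of how an SVM-based tagger is usually fed, the sketch below builds context-window and affix features for each token. The abstract does not describe the feature set, so everything here is an assumption: the function names `token_features` and `sentence_features`, the window size, the affix lengths, the `<PAD>` marker, and the toy romanized tokens are hypothetical and not taken from the paper.

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int,
                   window: int = 2, max_affix: int = 3) -> Dict[str, str]:
    """Build a sparse feature dict for tokens[i].

    Hypothetical feature template (not the paper's): the word itself,
    its neighbours within `window` positions, and short prefixes/suffixes.
    """
    word = tokens[i]
    feats = {"word": word}

    # Neighbouring words; positions outside the sentence get a padding marker.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        feats[f"word[{offset:+d}]"] = tokens[j] if 0 <= j < len(tokens) else "<PAD>"

    # Affixes of length 1..max_affix, only when the word is longer than the affix.
    for n in range(1, max_affix + 1):
        if len(word) > n:
            feats[f"prefix{n}"] = word[:n]
            feats[f"suffix{n}"] = word[-n:]

    return feats


def sentence_features(tokens: List[str]) -> List[Dict[str, str]]:
    """Feature dicts for every token in one sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]


if __name__ == "__main__":
    # Toy romanized tokens, for illustration only.
    sentence = ["avan", "pustakam", "vayichu"]
    for feats in sentence_features(sentence):
        print(feats)
```

Feature dicts of this shape could then be one-hot encoded and passed to a linear SVM classifier with one class per tag, for example with scikit-learn's `DictVectorizer` and `LinearSVC`. That pairing is only one possible toolchain; the abstract does not say which SVM implementation was used.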