Development of a POS Tagger for Malayalam - An Experience

Authors:
Manju K.; Soumya S.; Sumam Mary Idicula
Venue:
ARTCOM '09 Proceedings of the 2009 International Conference on Advances in Recent Technologies in Communication and Computing
Year:
2009

Citing 0
Cited 3

Second-order HMM for event extraction from short message

NLDB'10 Proceedings of the Natural language processing and information systems, and 15th international conference on Applications of natural language to information systems
Subject and object identification in Malayalam text

Proceedings of the International Conference on Advances in Computing, Communications and Informatics
Word category disambiguation for Malayalam: a language model approach

Proceedings of the Second International Conference on Computational Science, Engineering and Information Technology


Abstract

A stochastic part-of-speech (POS) tagger for Malayalam is proposed. The tagger uses word frequencies and bigram statistics drawn from a corpus. Because no annotated corpus is available for Malayalam, a morphological analyzer is used to generate the tagged corpus. Although the experiments were performed on a very small corpus, the results indicate that the statistical approach works well for a highly agglutinative language like Malayalam.
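
The abstract does not spell out how the word frequencies and bigram statistics are combined, so the following is only a minimal sketch of one common way to do it: a bigram hidden Markov model decoded with the Viterbi algorithm. The add-one smoothing, the function names (`train`, `log_prob`, `viterbi`), and the toy English corpus are all assumptions made for illustration; the toy corpus merely stands in for the tagged corpus that the paper generates with a morphological analyzer.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus (English placeholders, not data from the paper).
CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]

def train(tagged_sentences):
    """Collect word-per-tag frequencies and tag-bigram counts."""
    emit = defaultdict(Counter)   # emit[tag][word] = count
    trans = defaultdict(Counter)  # trans[previous_tag][tag] = count
    vocab = set()
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start marker
        for word, tag in sentence:
            emit[tag][word] += 1
            trans[prev][tag] += 1
            vocab.add(word)
            prev = tag
    return emit, trans, vocab

def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability (smoothing choice is an assumption)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))

def viterbi(words, emit, trans, vocab):
    """Return the most probable tag sequence under the bigram model."""
    if not words:
        return []
    tags = sorted(emit)
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra slot for unseen words
    # best[i][tag] = (log score of best path ending in tag, previous tag)
    best = [{}]
    for t in tags:
        score = (log_prob(trans["<s>"], t, n_tags)
                 + log_prob(emit[t], words[0], n_words))
        best[0][t] = (score, None)
    for i in range(1, len(words)):
        best.append({})
        for t in tags:
            e = log_prob(emit[t], words[i], n_words)
            best[i][t] = max(
                (best[i - 1][p][0] + log_prob(trans[p], t, n_tags) + e, p)
                for p in tags
            )
    # Follow the back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]

EMIT, TRANS, VOCAB = train(CORPUS)
print(viterbi(["the", "cat", "barks"], EMIT, TRANS, VOCAB))
```

On a very small corpus, as in the paper's experiments, most counts are zero, which is why the sketch smooths both the emission and the transition estimates; a real system would likely need a more careful treatment of unseen word forms, especially for an agglutinative language.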