Adding morphological information to a connectionist part-of-speech tagger
CAEPIA'09 Proceedings of the Current topics in artificial intelligence, and 13th conference on Spanish association for artificial intelligence
This paper presents a neural-network-based part-of-speech tagger that learns to assign correct part-of-speech tags to the words in a sentence. A three-layer multilayer perceptron (MLP) network is used. The MLP-tagger is trained with the error back-propagation learning algorithm. The representation scheme for the input and output of the network is adapted from Ma et al. [6]. The tagger is trained on the SUSANNE English tagged corpus, which consists of 156,622 words; 85% of the corpus is used for training. Based on the tag mappings learned, the MLP-tagger achieved an accuracy of 90.04% on test data that also included words unseen during training. Results from our experiments suggest that the MLP-tagger, combined with the representation scheme adopted here, could be a better substitute for traditional tagging approaches. This method shows promise for part-of-speech tagging of Indian-language text, given that most Indian-language corpora, especially tagged ones, are still considerably small.
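To make the architecture described in the abstract concrete, the following is a minimal sketch of a three-layer MLP (input, one hidden layer, output) trained with error back-propagation to map words to tags. Everything specific in it is an assumption for illustration: the toy sentences, the three-tag set, the hidden-layer size, the learning rate, and above all the input encoding (one-hot vectors for the previous and current word), which is a placeholder and not the representation scheme of Ma et al. [6]. The names `MLP`, `encode`, `make_examples`, and `predict` are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class MLP:
    """Input layer -> one sigmoid hidden layer -> sigmoid output layer."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # Each row holds the weights of one unit; the last entry is its bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in self.w1]
        y = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0]))) for row in self.w2]
        return h, y

    def train_step(self, x, target, lr=0.5):
        """One online back-propagation update; returns the squared error before it."""
        h, y = self.forward(x)
        # Output deltas for squared error with sigmoid units.
        d_out = [(y[k] - target[k]) * y[k] * (1.0 - y[k]) for k in range(len(y))]
        # Hidden deltas: push output deltas back through w2 before w2 is updated.
        d_hid = [h[j] * (1.0 - h[j]) *
                 sum(d_out[k] * self.w2[k][j] for k in range(len(y)))
                 for j in range(len(h))]
        for k, row in enumerate(self.w2):
            for j, v in enumerate(h + [1.0]):
                row[j] -= lr * d_out[k] * v
        for j, row in enumerate(self.w1):
            for i, v in enumerate(x + [1.0]):
                row[i] -= lr * d_hid[j] * v
        return sum((y[k] - target[k]) ** 2 for k in range(len(y)))


# Toy data (an assumption; not taken from the SUSANNE corpus).
TAGS = ["DET", "NOUN", "VERB"]
SENTENCES = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
VOCAB = sorted({word for sentence in SENTENCES for word, _ in sentence})


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def encode(sentence, pos):
    """Previous-word one-hot (zeros at sentence start) followed by current-word one-hot."""
    cur = one_hot(VOCAB.index(sentence[pos][0]), len(VOCAB))
    if pos > 0:
        prev = one_hot(VOCAB.index(sentence[pos - 1][0]), len(VOCAB))
    else:
        prev = [0.0] * len(VOCAB)
    return prev + cur


def make_examples(sentences):
    return [(encode(s, i), one_hot(TAGS.index(s[i][1]), len(TAGS)))
            for s in sentences for i in range(len(s))]


def predict(net, x):
    _, y = net.forward(x)
    return TAGS[max(range(len(y)), key=y.__getitem__)]


examples = make_examples(SENTENCES)
net = MLP(n_in=2 * len(VOCAB), n_hidden=4, n_out=len(TAGS))
first_epoch_error = sum(net.train_step(x, t) for x, t in examples)
for _ in range(500):
    last_epoch_error = sum(net.train_step(x, t) for x, t in examples)
accuracy = sum(predict(net, x) == TAGS[t.index(1.0)] for x, t in examples) / len(examples)
```

Calling `predict(net, encode(SENTENCES[0], 1))` returns one of the labels in `TAGS` for the second word of the first toy sentence. The sketch trains and evaluates on the same handful of sentences, so `accuracy` here says nothing about the 85% training split or the 90.04% test accuracy reported in the abstract; it only shows the shape of the training loop.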