Verbs are where all the action lies: experiences of shallow parsing of a morphologically rich language

  • Authors:
  • Harshada Gune;Mugdha Bapat;Mitesh M. Khapra;Pushpak Bhattacharyya

  • Affiliations:
  • Indian Institute of Technology Bombay;Indian Institute of Technology Bombay;Indian Institute of Technology Bombay;Indian Institute of Technology Bombay

  • Venue:
  • COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Verb suffixes and verb complexes of morphologically rich languages carry a lot of information. We show that this information if harnessed for the task of shallow parsing can lead to dramatic improvements in accuracy for a morphologically rich language- Marathi. The crux of the approach is to use a powerful morphological analyzer backed by a high coverage lexicon to generate rich features for a CRF based sequence classifier. Accuracy figures of 94% for Part of Speech Tagging and 97% for Chunking using a modestly sized corpus (20K words) vindicate our claim that for morphologically rich languages linguistic insight can obviate the need for large amount of annotated corpora.