Improving POS tagging for ungrammatical phrases

  • Authors:
  • Daisuke Ninomiya;Maxim Mozgovoy

  • Affiliations:
  • The University of Aizu, Tsuruga, Ikki-machi, Aizuwakamatsu, Fukushima, Japan;The University of Aizu, Tsuruga, Ikki-machi, Aizuwakamatsu, Fukushima, Japan

  • Venue:
  • Proceedings of the 2012 Joint International Conference on Human-Centered Computer Environments
  • Year:
  • 2012

Abstract

Modern part-of-speech (POS) tagging tools produce high-quality markup for grammatically correct documents, but ungrammatical sentences remain challenging for them. In this paper we study POS tagging of texts that contain grammatical errors and show how POS taggers can be improved for use in this context. Specifically, we propose including ungrammatical POS-tagged sentences in the text corpus used to train a tagger (assuming the tagger is based on some form of machine learning).
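
The idea of adding ungrammatical tagged sentences to the training corpus can be illustrated with a minimal sketch. The abstract does not say which tagger or corpus the authors use, so everything below is an assumption made for illustration: a toy greedy tagger that looks up the pair (previous tag, word) and backs off to the word's most frequent tag, a handful of invented sentences with Penn Treebank-style tags, and the hypothetical helper names `train` and `tag`.

```python
from collections import Counter, defaultdict

def train(tagged_sentences):
    """Count tags per (previous tag, word) context and per word alone."""
    context = defaultdict(Counter)  # (previous tag, word) -> tag counts
    lexical = defaultdict(Counter)  # word -> tag counts
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start marker
        for word, t in sentence:
            context[(prev, word)][t] += 1
            lexical[word][t] += 1
            prev = t
    return context, lexical

def tag(model, words, default="NN"):
    """Greedy left-to-right tagging with back-off to word-only counts."""
    context, lexical = model
    prev, tags = "<s>", []
    for word in words:
        if (prev, word) in context:
            best = context[(prev, word)].most_common(1)[0][0]
        elif word in lexical:
            best = lexical[word].most_common(1)[0][0]
        else:
            best = default  # unknown word
        tags.append(best)
        prev = best
    return tags

# Invented grammatical training sentences (toy data, not from the paper).
grammatical = [
    [("I", "PRP"), ("like", "VBP"), ("tea", "NN")],
    [("it", "PRP"), ("looks", "VBZ"), ("like", "IN"), ("rain", "NN")],
    [("it", "PRP"), ("tastes", "VBZ"), ("like", "IN"), ("tea", "NN")],
    [("she", "PRP"), ("drinks", "VBZ"), ("tea", "NN")],
    [("tea", "NN"), ("is", "VBZ"), ("very", "RB"), ("hot", "JJ")],
    [("I", "PRP"), ("eat", "VBP"), ("fish", "NN")],
]

# One invented ungrammatical sentence, tagged with the intended reading.
ungrammatical = [
    [("I", "PRP"), ("very", "RB"), ("like", "VBP"), ("tea", "NN")],
]

phrase = ["she", "very", "like", "fish"]  # ungrammatical test phrase
baseline = tag(train(grammatical), phrase)
augmented = tag(train(grammatical + ungrammatical), phrase)
print(baseline)
print(augmented)
```

In this toy setting the baseline model has never seen "like" after an adverb, so it falls back to the word's most frequent tag (the preposition reading); once the erroneous but correctly tagged sentence is in the training data, the verb reading is chosen instead. A real experiment would of course need a full tagger, a realistic corpus, and a held-out evaluation set.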