Unsupervised Russian POS tagging with appropriate context

  • Authors:
  • Li Yang; Erik Peterson; John Chen; Yana Petrova; Rohini Srihari

  • Affiliations:
  • Janya Inc., Amherst, NY; Janya Inc., Amherst, NY; Janya Inc., Amherst, NY; Department of Linguistics, State University of New York at Buffalo, Buffalo, NY; Janya Inc., Amherst, NY

  • Venue:
  • TSD'11: Proceedings of the 14th International Conference on Text, Speech and Dialogue
  • Year:
  • 2011

Abstract

Adopting the contextualized hidden Markov model (CHMM) framework for unsupervised Russian POS tagging, we investigate how left, right, and unambiguous context can be used within that framework. We propose a backoff smoothing method that incorporates all three types of context into transition probability estimation during the expectation-maximization process. The resulting model achieves overall and disambiguation accuracies comparable to those of a CHMM using the classic backoff smoothing method for HMM-based POS tagging from [17].
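
As a rough illustration of the general backoff idea mentioned in the abstract, the sketch below estimates a transition probability from expected counts (such as those an EM E-step would produce) and falls back to a less specific context when the more specific one has too little evidence. This is not the authors' method: the function name, the context labels, the `min_count` threshold, and the uniform fallback are all assumptions made for illustration, and the scheme is a simplified fallback rather than a properly discounted backoff.

```python
def backoff_transition_prob(tag, contexts, counts, context_totals,
                            tagset_size, min_count=1.0):
    """Return an estimate of P(tag | context) with a simple fallback chain.

    contexts       -- hypothetical context keys, ordered most to least specific
    counts         -- expected counts keyed by (context, tag)
    context_totals -- expected counts keyed by context
    tagset_size    -- number of tags, used for the uniform last resort
    min_count      -- assumed evidence threshold for trusting a context
    """
    for ctx in contexts:
        total = context_totals.get(ctx, 0.0)
        if total >= min_count:
            # Enough evidence for this context: use its relative frequency.
            return counts.get((ctx, tag), 0.0) / total
    # No context had enough evidence: fall back to a uniform distribution.
    return 1.0 / tagset_size


# Toy expected counts (invented for illustration only).
counts = {
    (("left", "N"), "V"): 3.0,
    (("left", "N"), "N"): 1.0,
}
context_totals = {("left", "N"): 4.0}

# The more specific left+right context is unseen, so the estimate
# falls back to the left-only context.
p_seen = backoff_transition_prob(
    "V",
    [("left+right", "N", "A"), ("left", "N")],
    counts, context_totals, tagset_size=3,
)

# Neither context is seen, so the uniform fallback applies.
p_unseen = backoff_transition_prob(
    "V",
    [("left+right", "A", "A"), ("left", "A")],
    counts, context_totals, tagset_size=3,
)
```

Ordering the contexts from most to least specific keeps the fallback logic in one loop; a real smoothing scheme would also redistribute probability mass so that the estimates for each context sum to one.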