Part-of-speech tagging based on hidden Markov model assuming joint independence

  • Authors:
  • Sang-Zoo Lee; Jun-ichi Tsujii; Hae-Chang Rim

  • Affiliations:
  • University of Tokyo, Hongo, Tokyo, Japan; University of Tokyo, Hongo, Tokyo, Japan; Korea University, Anam-Dong, Korea

  • Venue:
  • ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
  • Year:
  • 2000

Abstract

In this paper we present part-of-speech taggers based on hidden Markov models that adopt a less strict Markov assumption in order to consider rich contexts. In models whose parameters are very specific, such as lexicalized ones, the sparse-data problem is severe and conditional probabilities tend to be estimated unreliably. To overcome data sparseness, we use a simplified version of the well-known back-off smoothing method. To mitigate the unreliable-estimation problem, our models assume joint independence instead of conditional independence, because joint probabilities have the same degree of estimation reliability. In experiments on the Brown corpus, models with rich contexts achieve relatively high accuracy, and some models assuming joint independence show better results than the corresponding HMMs.
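
The abstract does not spell out the simplified back-off scheme, so the following is only a generic sketch of the back-off idea, not the authors' formulation. It assumes a tag-trigram context and falls back to shorter contexts whenever the longer one was never observed with the given tag. The function names `train` and `backoff_prob` and the toy tag sequences are hypothetical, and the estimate applies no discounting, so it is not a properly normalized distribution.

```python
from collections import Counter


def train(tag_sequences):
    """Count (context, tag) pairs for context lengths 0, 1 and 2."""
    joint = Counter()    # (context tuple, tag) -> count
    context = Counter()  # context tuple -> count
    for tags in tag_sequences:
        for i, tag in enumerate(tags):
            for n in range(3):
                if i >= n:
                    ctx = tuple(tags[i - n:i])
                    joint[(ctx, tag)] += 1
                    context[ctx] += 1
    return joint, context


def backoff_prob(tag, history, joint, context):
    """Relative frequency of `tag` under the longest context seen with it.

    Assumption: plain back-off without discounting or back-off weights.
    """
    ctx = tuple(history[-2:])
    while True:
        if joint[(ctx, tag)] > 0:
            return joint[(ctx, tag)] / context[ctx]
        if not ctx:
            return 0.0  # tag never observed at all
        ctx = ctx[1:]   # drop the oldest tag and retry


# Toy data (hypothetical), not taken from the Brown corpus.
toy = [["DT", "NN", "VB"], ["DT", "JJ", "NN"]]
joint, context = train(toy)
p_seen = backoff_prob("VB", ["DT", "NN"], joint, context)
p_backed_off = backoff_prob("NN", ["JJ", "DT"], joint, context)
```

Here `p_seen` uses the full two-tag context, while `p_backed_off` falls back to the one-tag context `("DT",)` because the pair `("JJ", "DT")` never occurs in the toy data.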