Lexicalized hidden Markov models for part-of-speech tagging

  • Authors:
  • Sang-Zoo Lee; Jun-ichi Tsujii; Hae-Chang Rim

  • Affiliations:
  • University of Tokyo, Tokyo, Japan; University of Tokyo, Tokyo, Japan; Korea University, Seoul, Korea

  • Venue:
  • COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 1
  • Year:
  • 2000

Abstract

Because most previous work on HMM-based tagging considers only part-of-speech information in the context, those models cannot use lexical information, which is crucial for resolving some morphological ambiguities. In this paper we introduce uniformly lexicalized HMMs for part-of-speech tagging in both English and Korean. The lexicalized models use a simplified back-off smoothing technique to overcome data sparseness. In experiments, the lexicalized models achieve higher accuracy than non-lexicalized models, and the back-off smoothing method mitigates data sparseness better than simple smoothing methods.
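
As a rough illustration of the general idea only, the sketch below conditions a tag transition on both the previous tag and the previous word, and backs off to the previous tag alone when that lexicalized context was never observed. It is not the paper's formulation: the class name `BackoffTransitions`, the toy corpus, the `<s>` start symbol, and the back-off rule itself (plain relative frequencies with no discounting) are all assumptions made for this example.

```python
from collections import Counter


class BackoffTransitions:
    """Hypothetical estimator of P(tag | prev_tag, prev_word) with back-off."""

    def __init__(self, tagged_sentences):
        self.lex_context = Counter()  # count(prev_tag, prev_word)
        self.lex_event = Counter()    # count(prev_tag, prev_word, tag)
        self.tag_context = Counter()  # count(prev_tag)
        self.tag_event = Counter()    # count(prev_tag, tag)
        for sentence in tagged_sentences:
            prev_word, prev_tag = "<s>", "<s>"  # assumed start-of-sentence symbol
            for word, tag in sentence:
                self.lex_context[(prev_tag, prev_word)] += 1
                self.lex_event[(prev_tag, prev_word, tag)] += 1
                self.tag_context[prev_tag] += 1
                self.tag_event[(prev_tag, tag)] += 1
                prev_word, prev_tag = word, tag

    def prob(self, tag, prev_tag, prev_word):
        # Lexicalized estimate: used whenever the (tag, word) context was seen.
        seen = self.lex_context[(prev_tag, prev_word)]
        if seen > 0:
            return self.lex_event[(prev_tag, prev_word, tag)] / seen
        # Back off to the non-lexicalized (tag-only) estimate.
        seen = self.tag_context[prev_tag]
        if seen > 0:
            return self.tag_event[(prev_tag, tag)] / seen
        return 0.0  # context never observed at all


# Toy corpus, invented for this example.
corpus = [
    [("the", "DT"), ("can", "NN"), ("rusts", "VBZ")],
    [("they", "PRP"), ("can", "MD"), ("swim", "VB")],
    [("the", "DT"), ("old", "JJ"), ("man", "NN")],
    [("a", "DT"), ("dog", "NN"), ("barks", "VBZ")],
]
model = BackoffTransitions(corpus)

print(model.prob("NN", "DT", "the"))   # seen lexicalized context
print(model.prob("NN", "DT", "this"))  # unseen word: backs off to tag-only counts
```

In this toy setup the context ("DT", "the") gets its own estimate, while the unseen context ("DT", "this") falls back to counts pooled over all determiners. A real tagger would also lexicalize the emission probabilities and apply discounting, neither of which is shown here.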