Word lattice reranking for Chinese word segmentation and part-of-speech tagging

  • Authors:
  • Wenbin Jiang;Haitao Mi;Qun Liu

  • Affiliations:
  • Chinese Academy of Sciences, Beijing, China and Graduate University of Chinese Academy of Sciences, Beijing, China;Chinese Academy of Sciences, Beijing, China and Graduate University of Chinese Academy of Sciences, Beijing, China;Chinese Academy of Sciences, Beijing, China

  • Venue:
  • COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we describe a new reranking strategy named word lattice reranking, for the task of joint Chinese word segmentation and part-of-speech (POS) tagging. As a derivation of the forest reranking for parsing (Huang, 2008), this strategy reranks on the pruned word lattice, which potentially contains much more candidates while using less storage, compared with the traditional n-best list reranking. With a perceptron classifier trained with local features as the baseline, word lattice reranking performs reranking with non-local features that can't be easily incorporated into the perceptron baseline. Experimental results show that, this strategy achieves improvement on both segmentation and POS tagging, above the perceptron baseline and the n-best list reranking.