Joint models for Chinese POS tagging and dependency parsing

  • Authors:
  • Zhenghua Li; Min Zhang; Wanxiang Che; Ting Liu; Wenliang Chen; Haizhou Li

  • Affiliations:
  • Harbin Institute of Technology, China; Institute for Infocomm Research, Singapore; Harbin Institute of Technology, China; Harbin Institute of Technology, China; Institute for Infocomm Research, Singapore; Institute for Infocomm Research, Singapore

  • Venue:
  • EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
  • Year:
  • 2011

Abstract

Part-of-speech (POS) tags are an indispensable feature in dependency parsing. Current research usually models POS tagging and dependency parsing independently, which can suffer from error propagation: our experiments show that parsing accuracy drops by about 6% when automatic POS tags are used instead of gold ones. To address this issue, this paper jointly optimizes POS tagging and dependency parsing in a single model. We design several joint models and their corresponding decoding algorithms to incorporate different feature sets. We further present an effective pruning strategy that reduces the search space of candidate POS tags, leading to a significant improvement in parsing speed. Experimental results on Chinese Penn Treebank 5 show that our joint models significantly improve the state-of-the-art parsing accuracy by about 1.5%. Detailed analysis shows that the joint method is able to choose POS tags that are more helpful and discriminative from the parsing viewpoint, which is the fundamental reason for the improvement in parsing accuracy.
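
The abstract does not spell out how the pruning strategy works, so the following is only a minimal sketch of one plausible way to shrink the set of candidate POS tags per word before joint decoding. It assumes that each word comes with a probability for every tag (for example from a separate tagger); the function name `prune_tags`, the `ratio` threshold, the `max_tags` cap, and the toy probabilities are all hypothetical and are not taken from the paper.

```python
from math import prod
from typing import Dict, List


def prune_tags(tag_probs: Dict[str, float],
               ratio: float = 0.01,
               max_tags: int = 2) -> List[str]:
    """Keep at most `max_tags` tags whose probability is at least
    `ratio` times the probability of the best tag (hypothetical rule)."""
    if not tag_probs:
        return []
    # Rank tags from most to least probable.
    ranked = sorted(tag_probs.items(), key=lambda kv: kv[1], reverse=True)
    best_prob = ranked[0][1]
    kept = [tag for tag, p in ranked if p >= ratio * best_prob]
    return kept[:max_tags]


# Toy per-word tag distributions for a three-word sentence (made-up numbers).
sentence_probs = [
    {"NN": 0.90, "VV": 0.09, "JJ": 0.01},
    {"VV": 0.55, "NN": 0.44, "AD": 0.01},
    {"PU": 1.0},
]

candidates = [prune_tags(p, ratio=0.05, max_tags=2) for p in sentence_probs]

# Number of tag sequences a joint decoder would have to consider,
# before and after pruning.
full_space = prod(len(p) for p in sentence_probs)
pruned_space = prod(len(c) for c in candidates)
```

In this toy case the number of possible tag sequences falls from 9 to 4. A joint decoder that only considers the surviving tags for each word has correspondingly fewer tag combinations to score, which is the kind of speed-up the abstract attributes to pruning.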