Incremental joint approach to word segmentation, POS tagging, and dependency parsing in Chinese

  • Authors:
  • Jun Hatori; Takuya Matsuzaki; Yusuke Miyao; Jun'ichi Tsujii

  • Affiliations:
  • University of Tokyo, Hongo, Bunkyo, Tokyo, Japan; National Institute of Informatics, Hitotsubashi, Chiyoda, Tokyo, Japan; National Institute of Informatics, Hitotsubashi, Chiyoda, Tokyo, Japan; Microsoft Research Asia, Beijing, P.R. China

  • Venue:
  • ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
  • Year:
  • 2012

Abstract

We propose the first joint model for word segmentation, POS tagging, and dependency parsing for Chinese. Based on an extension of the incremental joint model for POS tagging and dependency parsing (Hatori et al., 2011), we propose an efficient character-based decoding method that can combine features from state-of-the-art segmentation, POS tagging, and dependency parsing models. We also describe our method to align comparable states in the beam, and how we can combine features of different characteristics in our incremental framework. In experiments using the Chinese Treebank (CTB), we show that the accuracies of the three tasks can be improved significantly over the baseline models, particularly by 0.6% for POS tagging and 2.4% for dependency parsing. We also perform comparison experiments with the partially joint models.
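To make the idea of character-based incremental decoding with a beam more concrete, the following is a toy sketch only, not the authors' model. It assumes a hypothetical two-action scheme (start a new tagged word, or append the character to the last word), covers only segmentation and POS tagging (dependency parsing is omitted), and uses a made-up lexicon score (`LEXICON`, `score_word`) in place of learned features. Every hypothesis in the beam has consumed the same number of characters, which is one simple way of keeping states comparable.

```python
from typing import List, Tuple

Word = Tuple[str, str]              # (surface form, POS tag)
Hypothesis = Tuple[float, List[Word]]

# Hypothetical toy lexicon; a real system would use learned feature weights.
LEXICON = {("我", "PN"): 2.0, ("喜欢", "VV"): 3.0, ("猫", "NN"): 2.0}


def score_word(word: str, tag: str) -> float:
    """Assumed scoring: lexicon hit, else a penalty of -1 per character."""
    return LEXICON.get((word, tag), -1.0 * len(word))


def hyp_score(words: List[Word]) -> float:
    """Score a hypothesis as the sum of its word scores."""
    return sum(score_word(w, t) for w, t in words)


def beam_decode(chars: str, tags: List[str], beam_size: int = 4) -> Hypothesis:
    """Process one character per step, keeping the best `beam_size` hypotheses."""
    beam: List[Hypothesis] = [(0.0, [])]
    for ch in chars:
        candidates: List[Hypothesis] = []
        for _, words in beam:
            # Action 1: start a new word at this character with each tag.
            for tag in tags:
                new_words = words + [(ch, tag)]
                candidates.append((hyp_score(new_words), new_words))
            # Action 2: append the character to the last word (if any).
            if words:
                last, tag = words[-1]
                new_words = words[:-1] + [(last + ch, tag)]
                candidates.append((hyp_score(new_words), new_words))
        # All candidates cover the same prefix, so their scores are comparable.
        candidates.sort(key=lambda h: h[0], reverse=True)
        beam = candidates[:beam_size]
    return beam[0]


if __name__ == "__main__":
    best_score, best_words = beam_decode("我喜欢猫", ["PN", "VV", "NN"])
    print(best_score, best_words)
```

In this sketch a larger `beam_size` trades speed for a lower risk of pruning the best analysis early; the scoring function is the part a real joint model would replace with features from all three tasks.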