Incremental joint approach to word segmentation, POS tagging, and dependency parsing in Chinese

Authors:
Jun Hatori;Takuya Matsuzaki;Yusuke Miyao;Jun'ichi Tsujii
Affiliations:
University of Tokyo, Hongo, Bunkyo, Tokyo, Japan;National Institute of Informatics, Hitotsubashi, Chiyoda, Tokyo, Japan;National Institute of Informatics, Hitotsubashi, Chiyoda, Tokyo, Japan;Microsoft Research Asia, Beijing, P.R. China
Venue:
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Year:
2012

Citing 9
Cited 2

A stochastic finite-state word-segmentation algorithm for Chinese

Computational Linguistics
A maximum entropy Chinese character-based parser

EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
Incremental parsing with the perceptron algorithm

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
An error-driven word-character hybrid model for joint Chinese word segmentation and POS tagging

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Dynamic programming for linear-time incremental parsing

ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
A fast decoder for joint word segmentation and POS-tagging using a single discriminative model

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
A stacked sub-word model for joint Chinese word segmentation and part-of-speech tagging

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Transition-based dependency parsing with rich non-local features

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Joint models for Chinese POS tagging and dependency parsing

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing

Multilingual joint parsing of syntactic and semantic dependencies with a latent variable model

Computational Linguistics
Joint Optimization for Chinese POS Tagging and Dependency Parsing

IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)

Quantified Score

Hi-index	0.00

Visualization

Abstract

We propose the first joint model for word segmentation, POS tagging, and dependency parsing for Chinese. Based on an extension of the incremental joint model for POS tagging and dependency parsing (Hatori et al., 2011), we propose an efficient character-based decoding method that can combine features from state-of-the-art segmentation, POS tagging, and dependency parsing models. We also describe our method to align comparable states in the beam, and how we can combine features of different characteristics in our incremental framework. In experiments using the Chinese Treebank (CTB), we show that the accuracies of the three tasks can be improved significantly over the baseline models, particularly by 0.6% for POS tagging and 2.4% for dependency parsing. We also perform comparison experiments with the partially joint models.