A fast decoder for joint word segmentation and POS-tagging using a single discriminative model

Authors:
Yue Zhang;Stephen Clark
Affiliations:
University of Cambridge Computer Laboratory, Cambridge, UK;University of Cambridge Computer Laboratory, Cambridge, UK
Venue:
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Year:
2010

Citing 8
Cited 8

Discriminative training methods for hidden Markov models: theory and experiments with perceptron algorithms

EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Incremental parsing with the perceptron algorithm

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Multilevel coarse-to-fine PCFG parsing

HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
A hybrid approach to word segmentation and POS tagging

ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
Word lattice reranking for Chinese word segmentation and part-of-speech tagging

COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Classifying chart cells for quadratic complexity context-free inference

COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
A dual-layer CRFs based joint decoding method for cascaded segmentation and labeling tasks

IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
An error-driven word-character hybrid model for joint Chinese word segmentation and POS tagging

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1

Syntactic processing using the generalized perceptron and beam search

Computational Linguistics
A stacked sub-word model for joint Chinese word segmentation and part-of-speech tagging

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Reducing approximation and estimation errors for Chinese lexical processing with heterogeneous annotations

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Incremental joint approach to word segmentation, POS tagging, and dependency parsing in Chinese

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Unsupervized word segmentation: the case for Mandarin Chinese

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
Iterative annotation transformation with predict-self reestimation for Chinese word segmentation

EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Joint Chinese word segmentation, POS tagging and parsing

EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Exploiting chunk-level features to improve phrase chunking

EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Quantified Score

Hi-index	0.00

Visualization

Abstract

We show that the standard beam-search algorithm can be used as an efficient decoder for the global linear model of Zhang and Clark (2008) for joint word segmentation and POS-tagging, achieving a significant speed improvement. Such decoding is enabled by: (1) separating full word features from partial word features so that feature templates can be instantiated incrementally, according to whether the current character is separated or appended; (2) deciding the POS-tag of a potential word when its first character is processed. Early-update is used with perceptron training so that the linear model gives a high score to a correct partial candidate as well as a full output. Effective scoring of partial structures allows the decoder to give high accuracy with a small beam-size of 16. In our 10-fold cross-validation experiments with the Chinese Tree-bank, our system performed over 10 times as fast as Zhang and Clark (2008) with little accuracy loss. The accuracy of our system on the standard CTB 5 test was competitive with the best in the literature.