Training Global Linear Models for Chinese Word Segmentation

Authors:
Dong Song;Anoop Sarkar
Affiliations:
School of Computing Science, Simon Fraser University, Burnaby, Canada V5A1S6;School of Computing Science, Simon Fraser University, Burnaby, Canada V5A1S6
Venue:
Canadian AI '09 Proceedings of the 22nd Canadian Conference on Artificial Intelligence: Advances in Artificial Intelligence
Year:
2009

Abstract

This paper examines how to obtain state-of-the-art Chinese word segmentation using global linear models. We provide experimental comparisons that give a detailed road map for obtaining state-of-the-art accuracy on various datasets. In particular, we compare reranking with full beam search; we compare various methods for learning weights for full-sentence features, such as language model features; and we compare an Averaged Perceptron global linear model with the Exponentiated Gradient max-margin algorithm.
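
As a rough illustration of what an averaged perceptron global linear model looks like for segmentation, the following is a minimal sketch and not the authors' implementation. It scores each whole candidate segmentation with a weight vector over global features and applies the standard perceptron update, then averages the weights over all steps. The feature templates (`word=` and `len=`), the function names, and the toy data are all assumptions made for this example; exhaustive enumeration of segmentations stands in for the n-best list or beam search that a practical system would use.

```python
from collections import defaultdict
from itertools import product


def gen(chars):
    """Toy candidate generator: enumerate every segmentation of a short string.
    (Assumption: stands in for an n-best list or beam search.)"""
    candidates = []
    for cuts in product([False, True], repeat=len(chars) - 1):
        words, start = [], 0
        for i, cut in enumerate(cuts, start=1):
            if cut:
                words.append(chars[start:i])
                start = i
        words.append(chars[start:])
        candidates.append(tuple(words))
    return candidates


def features(words):
    """Global feature vector computed over a whole candidate segmentation.
    (Hypothetical templates: word identity and word length.)"""
    f = defaultdict(float)
    for w in words:
        f["word=" + w] += 1.0
        f["len=%d" % len(w)] += 1.0
    return f


def score(weights, feats):
    """Linear score: dot product of weights and features."""
    return sum(weights.get(k, 0.0) * v for k, v in feats.items())


def decode(weights, chars):
    """Return the highest-scoring candidate segmentation."""
    return max(gen(chars), key=lambda y: score(weights, features(y)))


def train(data, epochs=5):
    """Averaged perceptron: update on mistakes, average weights over all steps."""
    w, total, steps = defaultdict(float), defaultdict(float), 0
    for _ in range(epochs):
        for gold in data:
            pred = decode(w, "".join(gold))
            if pred != tuple(gold):
                for k, v in features(gold).items():
                    w[k] += v
                for k, v in features(pred).items():
                    w[k] -= v
            # Accumulate the current weights after every example.
            for k, v in w.items():
                total[k] += v
            steps += 1
    return {k: v / steps for k, v in total.items()}


# Toy training data (hypothetical): each example is a gold segmentation.
data = [("北京", "大学"), ("北京",), ("大学", "北京")]
model = train(data)
print(decode(model, "北京大学"))
```

Averaging the weights over every training step, rather than keeping only the final vector, is the usual way to reduce the perceptron's sensitivity to the last few updates.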