Combining Language Modeling and Discriminative Classification for Word Segmentation

  • Authors:
  • Dekang Lin

  • Affiliations:
  • Google, Inc., Mountain View, USA 94043

  • Venue:
  • CICLing '09 Proceedings of the 10th International Conference on Computational Linguistics and Intelligent Text Processing
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Generative language modeling and discriminative classification are two main techniques for Chinese word segmentation. Most previous methods have adopted one of the techniques. We present a hybrid model that combines the disambiguation power of language modeling and the ability of discriminative classifiers to deal with out-of-vocabulary words. We show that the combined model achieves 9% error reduction over the discriminative classifier alone.