Chinese word segmentation as LMR tagging

  • Authors:
  • Nianwen Xue;Libin Shen

  • Affiliations:
  • University of Pennsylvania, Philadelphia, PA;University of Pennsylvania, Philadelphia, PA

  • Venue:
  • SIGHAN '03 Proceedings of the second SIGHAN workshop on Chinese language processing - Volume 17
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper we present Chinese word segmentation algorithms based on the so-called LMR tagging. Our LMR taggers are implemented with the Maximum Entropy Markov Model and we then use Transformation-Based Learning to combine the results of the two LMR taggers that scan the input in opposite directions. Our system achieves F-scores of 95.9% and 91.6% on the Academia Sinica corpus and the Hong Kong City University corpus respectively.