Chunking-based Chinese word tokenization

  • Authors: GuoDong Zhou
  • Affiliations: Institute for Infocomm Research, Singapore
  • Venue: SIGHAN '03 Proceedings of the second SIGHAN workshop on Chinese language processing - Volume 17
  • Year: 2003

Abstract

This paper introduces a Chinese word tokenization system based on HMM chunking. Experiments show that the system handles the unknown word problem in Chinese word tokenization well.
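
To make the general idea concrete, the following is a minimal sketch of HMM-style tokenization by tagging each character with its position in a word and decoding with Viterbi. It is not the paper's model: the B/M/E/S tag set, the probability tables `START`, `TRANS`, and `EMIT`, the `unseen` floor, and the function names are all hypothetical, hand-set assumptions chosen only for illustration.

```python
import math

# Position-in-word tags: Begin, Middle, End, Single-character word.
TAGS = ["B", "M", "E", "S"]

# Hypothetical toy parameters, NOT taken from the paper.
START = {"B": 0.6, "M": 0.0, "E": 0.0, "S": 0.4}
TRANS = {
    "B": {"B": 0.0, "M": 0.3, "E": 0.7, "S": 0.0},
    "M": {"B": 0.0, "M": 0.3, "E": 0.7, "S": 0.0},
    "E": {"B": 0.6, "M": 0.0, "E": 0.0, "S": 0.4},
    "S": {"B": 0.6, "M": 0.0, "E": 0.0, "S": 0.4},
}
EMIT = {
    "B": {"中": 0.5, "分": 0.5},
    "E": {"文": 0.5, "词": 0.5},
    "S": {"的": 1.0},
}


def log(p):
    """Natural log that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(chars, emit=EMIT, unseen=1e-6):
    """Return the most likely tag sequence for chars under the toy HMM.

    Characters never seen with a tag get the small floor probability
    `unseen`, a crude stand-in for real unknown-word handling.
    """
    if not chars:
        return []

    def emission(tag, ch):
        return log(emit.get(tag, {}).get(ch, unseen))

    score = [{t: log(START[t]) + emission(t, chars[0]) for t in TAGS}]
    back = []
    for ch in chars[1:]:
        prev = score[-1]
        cur, ptr = {}, {}
        for t in TAGS:
            best = max(TAGS, key=lambda p: prev[p] + log(TRANS[p][t]))
            cur[t] = prev[best] + log(TRANS[best][t]) + emission(t, ch)
            ptr[t] = best
        score.append(cur)
        back.append(ptr)

    # The last character must close a word, so only E or S may end the path.
    last = max(("E", "S"), key=lambda t: score[-1][t])
    tags = [last]
    for ptr in reversed(back):
        tags.append(ptr[tags[-1]])
    return tags[::-1]


def tags_to_words(chars, tags):
    """Group characters into words, closing a word at each E or S tag."""
    words, buf = [], ""
    for ch, t in zip(chars, tags):
        buf += ch
        if t in ("E", "S"):
            words.append(buf)
            buf = ""
    if buf:
        words.append(buf)
    return words


if __name__ == "__main__":
    text = "中文分词"
    print(tags_to_words(text, viterbi(text)))
```

In a real system the tables would be estimated from a segmented corpus rather than written by hand, and the treatment of unseen characters would need to be far more careful than a constant floor.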