Systran's Chinese word segmentation

  • Authors:
  • Jin Yang; Jean Senellart; Remi Zajac

  • Affiliations:
  • SYSTRAN Software, Inc., San Diego, CA; SYSTRAN S.A., Soisy-sous-Montmorency, France; SYSTRAN Software, Inc., San Diego, CA

  • Venue:
  • SIGHAN '03 Proceedings of the second SIGHAN workshop on Chinese language processing - Volume 17
  • Year:
  • 2003

Abstract

SYSTRAN's Chinese word segmentation module is an important component of its Chinese-English machine translation system. The module uses a rule-based approach built on a large dictionary and fine-grained linguistic rules. It handles general-purpose texts from different Chinese-speaking regions, with comparable performance across regions. SYSTRAN participated in the four open tracks of the First International Chinese Word Segmentation Bakeoff. This paper gives a general description of the segmentation module and presents the results and an analysis of its performance in the Bakeoff.
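
To make the idea of dictionary-driven segmentation concrete, the following is a minimal sketch of greedy forward maximum matching, a common baseline for lexicon-based Chinese segmentation. It is not SYSTRAN's method: the abstract does not describe the module's matching strategy or its linguistic rules, and the function `segment`, the `max_len` parameter, and the toy lexicon are all hypothetical, chosen only for illustration.

```python
def segment(text, lexicon, max_len=4):
    """Greedy forward maximum matching over a lexicon (illustrative baseline only)."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the lexicon matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback for unknown text.
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy lexicon; a production dictionary would be far larger.
TOY_LEXICON = {"中文", "分词", "系统", "机器", "翻译", "机器翻译"}

print(segment("中文分词系统", TOY_LEXICON))  # ['中文', '分词', '系统']
```

A purely greedy matcher like this resolves ambiguity only by preferring longer entries, which is one reason a system might layer finer-grained linguistic rules on top of the dictionary, as the abstract indicates SYSTRAN's module does.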