Chinese and Japanese word segmentation using word-level and character-level information

  • Authors: Tetsuji Nakagawa
  • Affiliations: Oki Electric Industry Co., Ltd., Honmachi, Chuo-ku, Osaka, Japan
  • Venue: COLING '04 Proceedings of the 20th international conference on Computational Linguistics
  • Year: 2004

Abstract

In this paper, we present a hybrid method for Chinese and Japanese word segmentation. Word-level information is useful for analyzing known words, while character-level information is useful for analyzing unknown words. The method combines both types of information to handle known and unknown words effectively. Experimental results show that this method achieves high overall accuracy in Chinese and Japanese word segmentation.
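
To make the known-word versus unknown-word distinction concrete, the following is a minimal toy sketch of one way word-level and character-level information could be combined. It is not the method from the paper, whose details the abstract does not give. The function name `segment`, the `word_costs` dictionary, the `unknown_char_cost` penalty, and the rule that merges adjacent unknown characters are all assumptions made for illustration only.

```python
import math


def segment(text, word_costs, unknown_char_cost=10.0, max_word_len=4):
    """Toy hybrid segmenter (illustrative assumption, not the paper's model).

    Known words come from `word_costs` (word-level information).
    Characters not covered by any known word are treated one by one
    (a crude stand-in for character-level information) and adjacent
    unknown characters are then merged into a single unknown word.
    """
    n = len(text)
    best = [math.inf] * (n + 1)  # best[i]: minimum cost of segmenting text[:i]
    back = [0] * (n + 1)         # back[i]: start index of the last piece ending at i
    best[0] = 0.0

    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            piece = text[start:end]
            if piece in word_costs:
                cost = word_costs[piece]      # known word
            elif end - start == 1:
                cost = unknown_char_cost      # single unknown character
            else:
                continue                      # multi-character unknown pieces are not proposed
            if best[start] + cost < best[end]:
                best[end] = best[start] + cost
                back[end] = start

    # Follow the back-pointers to recover the pieces in left-to-right order.
    pieces = []
    end = n
    while end > 0:
        start = back[end]
        pieces.append(text[start:end])
        end = start
    pieces.reverse()

    # Merge runs of adjacent unknown characters into one unknown word.
    words = []
    previous_unknown = False
    for piece in pieces:
        unknown = piece not in word_costs
        if unknown and previous_unknown:
            words[-1] += piece
        else:
            words.append(piece)
        previous_unknown = unknown
    return words


# Hypothetical toy lexicon with made-up costs (lower means more likely).
toy_costs = {"ab": 1.0, "cd": 1.0, "a": 3.0}
print(segment("abxycd", toy_costs))
```

In this sketch the lexicon lookup stands in for word-level information, and the per-character fallback stands in for character-level information. A real system would replace the fixed penalty with a statistical model over characters.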