Design of Chinese word segmentation system based on improved Chinese converse dictionary and reverse maximum matching algorithm

  • Authors:
  • Liyi Zhang; Yazi Li; Jian Meng

  • Affiliations:
  • Center for Studies of Information Resources, Wuhan University;Center for Studies of Information Resources, Wuhan University;Center for Studies of Information Resources, Wuhan University

  • Venue:
  • WISE'06 Proceedings of the 7th international conference on Web Information Systems
  • Year:
  • 2006

Abstract

Growing interest in cross-lingual and multilingual information retrieval makes it a major challenge to design accurate information retrieval systems for Asian languages such as Chinese, Thai and Japanese. Word segmentation is one of the most important preprocessing steps in Chinese information processing. This paper reviews some popular word segmentation algorithms. It then proposes a Chinese word segmentation system based on an improved converse Chinese dictionary and an optimized reverse maximum matching algorithm. Experiments are carried out to demonstrate the system's substantially improved accuracy and speed.
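
For readers unfamiliar with the technique named in the abstract, the following is a minimal sketch of the general textbook form of reverse maximum matching, not the optimized variant the paper proposes. The function name `reverse_maximum_match`, the `max_word_len` parameter, and the toy dictionary are all assumptions made for illustration; a plain Python set stands in for the paper's converse dictionary, whose structure the abstract does not describe.

```python
def reverse_maximum_match(text, dictionary, max_word_len=4):
    """Segment text right to left, preferring the longest dictionary word.

    Illustrative sketch only: `dictionary` is a plain set of words and
    `max_word_len` is an assumed upper bound on word length.
    """
    words = []
    end = len(text)
    while end > 0:
        # Try the longest candidate ending at `end`, then shrink it
        # by dropping characters from its left side.
        for size in range(min(max_word_len, end), 0, -1):
            candidate = text[end - size:end]
            # A single character is always accepted as a fallback,
            # so the scan is guaranteed to make progress.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                end -= size
                break
    # Words were collected from the end of the text, so restore order.
    words.reverse()
    return words


# Hypothetical toy dictionary, not taken from the paper.
toy_dictionary = {"研究", "研究生", "生命", "起源"}
print(reverse_maximum_match("研究生命起源", toy_dictionary))
# → ['研究', '生命', '起源']
```

On this input a left-to-right maximum match would first take 研究生 and leave 命 stranded as a single character, which illustrates why scanning from the end of the sentence is a common choice for Chinese.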