A Dictionary Mechanism for Chinese Word Segmentation Based on the Finite Automata

  • Authors:
  • Wu Yang; Liyun Ren; Rong Tang

  • Venue:
  • IALP '10 Proceedings of the 2010 International Conference on Asian Language Processing
  • Year:
  • 2010

Abstract

The dictionary mechanism is the basis of Chinese word segmentation, and its quality directly affects segmentation speed and efficiency. Existing dictionary mechanisms suffer from shortcomings such as wasted space, low efficiency, and difficult maintenance, so establishing an effective mechanism is an urgent problem for Chinese word segmentation. In this paper, the idea of the finite-state automaton is first studied; a new kind of dictionary mechanism is then proposed to save space and improve the speed of Chinese word segmentation as far as possible; finally, the performance of various dictionary mechanisms is analyzed through theoretical study and experimental comparison. The results show that, compared with other mechanisms, the proposed dictionary mechanism based on the finite-state automaton improves both space complexity and time complexity.
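
The abstract does not describe the proposed structure in detail, so the following is only a generic illustration of the underlying idea, not the authors' mechanism. It is a minimal sketch that assumes the dictionary is stored as a trie (a simple deterministic finite automaton over characters) and that segmentation uses forward maximum matching. The names `DictionaryAutomaton`, `longest_match`, and `segment`, and the sample words, are all hypothetical.

```python
class DictionaryAutomaton:
    """Word dictionary stored as a trie, i.e. a deterministic finite automaton
    whose transitions are labelled by single characters (illustrative only)."""

    def __init__(self, words=()):
        # State 0 is the start state; transitions[s] maps a character to the next state.
        self.transitions = [{}]
        # accepting[s] is True when the path from the start state to s spells a word.
        self.accepting = [False]
        for word in words:
            self.add(word)

    def add(self, word):
        state = 0
        for ch in word:
            nxt = self.transitions[state].get(ch)
            if nxt is None:
                # Allocate a new state and link it from the current one.
                nxt = len(self.transitions)
                self.transitions.append({})
                self.accepting.append(False)
                self.transitions[state][ch] = nxt
            state = nxt
        self.accepting[state] = True

    def longest_match(self, text, start):
        """Length of the longest dictionary word beginning at text[start], or 0."""
        state = 0
        best = 0
        for i in range(start, len(text)):
            state = self.transitions[state].get(text[i])
            if state is None:
                break  # no transition: the automaton rejects any longer prefix
            if self.accepting[state]:
                best = i - start + 1
        return best


def segment(text, dictionary):
    """Forward maximum matching; characters not covered by any word
    are emitted as single-character tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        length = dictionary.longest_match(text, pos) or 1
        tokens.append(text[pos:pos + length])
        pos += length
    return tokens


if __name__ == "__main__":
    d = DictionaryAutomaton(["中文", "分词", "词典"])
    print(segment("中文分词词典", d))
```

In this sketch, words that share a prefix share states, which is one way an automaton-based dictionary can save space, and each lookup costs time proportional to the length of the matched word rather than to the size of the dictionary. Whether the paper's mechanism works the same way is not stated in the abstract.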