Recognizing unregistered names for Mandarin word identification

  • Authors:
  • Liang-Jyh Wang; Wei-Chuan Li; Chao-Huang Chang

  • Affiliations:
  • Industrial Technology Research Institute (ITRI), Hsinchu, Taiwan, R.O.C. (all three authors)

  • Venue:
  • COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 4
  • Year:
  • 1992

Abstract

Word identification has long been an important and active problem in Chinese natural language processing. In this paper, a new mechanism, based on the concept of sublanguage, is proposed for identifying unknown words, especially personal names, in Chinese newspapers. The proposed mechanism comprises title-driven name recognition, adaptive dynamic word formation, and identification of 2-character and 3-character Chinese names that appear without a title. We present experimental results for two corpora and compare them with the results of NTHU's statistics-based system, the only other system we know of that has addressed the same problem. The results show significant improvements over word identification (WI) systems that lack name identification capability.
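
To make the idea of title-driven name recognition concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that a name candidate is a 2- or 3-character string that begins with a known surname and sits directly before a title word. The surname and title lists, and the function name `find_titled_names`, are hypothetical placeholders; the abstract does not specify the lexicons or rules actually used.

```python
# Hypothetical toy lexicons; the paper's actual surname and title lists
# are not given in the abstract.
SURNAMES = {"王", "李", "張", "陳"}
TITLES = ["先生", "教授", "總統", "部長"]


def find_titled_names(text, surnames=SURNAMES, titles=TITLES):
    """Return sorted (start, name, title) tuples for each 2- or 3-character
    name candidate that directly precedes a title word in `text`."""
    found = []
    for title in titles:
        pos = text.find(title)
        while pos != -1:
            # Assumption: prefer the 3-character candidate, then fall back
            # to the 2-character one.
            for length in (3, 2):
                start = pos - length
                if start >= 0 and text[start] in surnames:
                    found.append((start, text[start:pos], title))
                    break
            pos = text.find(title, pos + 1)
    return sorted(found)


if __name__ == "__main__":
    for start, name, title in find_titled_names("昨天王小明教授和李華先生開會"):
        print(start, name, title)
```

On the sample sentence, this sketch reports 王小明 before 教授 and 李華 before 先生. A practical system would also need to resolve ambiguity with ordinary dictionary words, which this toy version ignores.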