Blending segmentation with tagging in Chinese language corpus processing

  • Authors:
  • Zhou Qiang; Yu Shiwen

  • Affiliations:
  • Peking University, Beijing, P.R. China; Peking University, Beijing, P.R. China

  • Venue:
  • COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 2
  • Year:
  • 1994

Abstract

This paper proposes a new method for Chinese language corpus processing. Unlike previous research, our approach has the following characteristics: it blends segmentation with tagging, and it integrates a rule-based approach with a statistics-based one in grammatical disambiguation. The principal ideas presented in the paper are incorporated in the development of a Chinese corpus processing system. Experimental results show that the overall accuracy is 97.68% for segmentation and 94.55% for tagging on a corpus of about 400,000 Chinese characters.