An IBM-PC environment for Chinese corpus analysis

  • Authors: Robert Wing Pong Luk
  • Affiliations: City Polytechnic of Hong Kong
  • Venue: COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1
  • Year: 1994

Abstract

This paper describes a set of computer programs for Chinese corpus analysis. The programs cover (1) extraction of distinct characters, bigrams and words; (2) word segmentation based on bigrams, on maximal matching, and on a combination of the two; (3) identification of special terms; (4) Chinese concordancing; (5) compilation of collocation statistics; and (6) evaluation utilities. The programs run on the IBM-PC, and batch programs coordinate their use.
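
As an illustration of two of the techniques the abstract names, character-bigram extraction and maximal-matching segmentation, here is a minimal sketch in Python. It is not the paper's implementation, which the abstract does not show. The function names `char_bigrams` and `max_match`, the `max_len` parameter, and the toy lexicon are all assumptions made for this example. The sketch uses the forward variant of maximal matching, which at each position takes the longest lexicon word and falls back to a single character when nothing matches.

```python
from collections import Counter


def char_bigrams(text):
    """Count adjacent character pairs (bigrams) in a string."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def max_match(text, lexicon, max_len=4):
    """Forward maximal matching (hypothetical helper, not from the paper).

    At each position, try the longest candidate first and shrink until a
    lexicon word is found; a single character is always accepted.
    """
    words = []
    i = 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + n]
            # Fall back to one character when no lexicon entry matches.
            if n == 1 or candidate in lexicon:
                words.append(candidate)
                i += n
                break
    return words


# Toy lexicon, assumed for illustration only.
lexicon = {"中文", "語料", "語料庫", "分析"}
sentence = "中文語料庫分析"

segmented = max_match(sentence, lexicon)
bigrams = char_bigrams(sentence)
print(segmented)
print(bigrams.most_common(3))
```

Because the longest candidate is tried first, the sketch prefers "語料庫" over the shorter lexicon entry "語料". A bigram-based or combined segmenter, as listed in the abstract, would additionally use the bigram counts to resolve cases where the greedy choice is wrong.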