Building a Chinese shallow parsed treebank for collocation extraction

  • Authors:
  • Li Baoli; Lu Qin; Li Yin

  • Affiliations:
  • Department of Computer Science and Technology, Peking University, Beijing, P.R. China; Department of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong; Department of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong

  • Venue:
  • CICLing'03: Proceedings of the 4th International Conference on Computational Linguistics and Intelligent Text Processing
  • Year:
  • 2003

Abstract

To automatically extract Chinese collocations and build a large-scale collocation bank, we are developing a one-million-word Chinese shallow parsed treebank. The treebank serves two purposes: it is a training set for our shallow parser, and it is the processed data from which collocations are extracted. This paper discusses several issues related to this ongoing project, including the definition of shallow parsing we use for Chinese collocation extraction, guideline preparation, and quality control.