Frequent words' grammar information in Chinese chunking

  • Authors:
  • Quan Qi;Li Liu;Yue Chen

  • Affiliations:
  • School of Computer Science, Beijing Institute of Technology, Beijing, China;School of Computer Science, Beijing Institute of Technology, Beijing, China;School of Computer Science, Beijing Institute of Technology, Beijing, China

  • Venue:
  • ISICA'10 Proceedings of the 5th international conference on Advances in computation and intelligence
  • Year:
  • 2010

Abstract

In Chinese, frequent words, which usually carry little significant information for information extraction, play an important role in the grammatical structure of sentences. However, the grammar information of these words is usually ignored in Chinese segmentation. In this paper, we design a Chinese chunking experiment that integrates the grammar information of frequent words and investigate the effect of this information on chunking. We use conditional random fields for chunking and rewrite the frequent words in the corpus so that they carry sentence structure information. The results show that the grammar information of frequent words, even when the number of such words is very small, can significantly increase the accuracy of Chinese chunking.
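
The abstract does not say how the frequent words are rewritten, so the following is only a minimal sketch of one possible reading: the k most frequent word forms receive a word-specific tag (part of speech plus the word itself), while all other words keep their plain part-of-speech tag, and these tags become per-token features of the kind a conditional random field toolkit typically consumes. The function names, the tag format, the feature names, and the toy corpus are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def frequent_words(sentences, k):
    """Return the k most frequent word forms in a corpus of (word, pos) sentences."""
    counts = Counter(word for sent in sentences for word, _pos in sent)
    return {word for word, _count in counts.most_common(k)}


def rewrite_token(word, pos, frequent):
    """Hypothetical rewrite rule: a frequent word gets a word-specific tag,
    any other word keeps its plain part-of-speech tag."""
    return f"{pos}_{word}" if word in frequent else pos


def token_features(sent, i, frequent):
    """Build a feature dict for token i from the rewritten tags of the
    token itself and of its immediate neighbours."""
    word, pos = sent[i]
    feats = {"word": word, "tag": rewrite_token(word, pos, frequent)}
    if i > 0:
        prev_word, prev_pos = sent[i - 1]
        feats["prev_tag"] = rewrite_token(prev_word, prev_pos, frequent)
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(sent) - 1:
        next_word, next_pos = sent[i + 1]
        feats["next_tag"] = rewrite_token(next_word, next_pos, frequent)
    else:
        feats["EOS"] = True  # end of sentence
    return feats


# Toy corpus invented for illustration; the POS labels are placeholders.
corpus = [
    [("我", "r"), ("的", "u"), ("书", "n")],
    [("他", "r"), ("的", "u"), ("朋友", "n"), ("的", "u"), ("车", "n")],
]
frequent = frequent_words(corpus, k=1)
features = [token_features(corpus[0], i, frequent) for i in range(len(corpus[0]))]
```

In this sketch the particle 的 is the only frequent word, so its tag becomes word-specific while the nouns and pronouns keep generic tags. The resulting feature dicts, paired with chunk labels, could then be passed to whichever CRF toolkit is used for training.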