Frequent words' grammar information in Chinese chunking

Authors:
Quan Qi;Li Liu;Yue Chen
Affiliations:
School of Computer Science, Beijing Institute of Technology, Beijing, China;School of Computer Science, Beijing Institute of Technology, Beijing, China;School of Computer Science, Beijing Institute of Technology, Beijing, China
Venue:
ISICA'10 Proceedings of the 5th international conference on Advances in computation and intelligence
Year:
2010

Abstract

In Chinese, frequent words, which usually carry little information of use for information extraction, play an important role in the grammatical structure of sentences. However, the grammatical information of these words is typically ignored in Chinese segmentation. In this paper, we design an experiment for Chinese chunking that integrates the grammatical information of frequent words and investigates its effect on chunking. We use conditional random fields for chunking and rewrite the frequent words in the corpus so that they carry sentence structure information. The results show that the grammatical information of frequent words, even when the number of such words is very small, can significantly increase the accuracy of Chinese chunking.
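The abstract does not say how the frequent words are rewritten, so the following is only a minimal sketch of one plausible reading: the part-of-speech tag of each frequent word is replaced by a word-specific tag, so that a sequence labeler such as a CRF can tell individual function words apart. The function names, the `POS_word` tag format, the toy corpus, and its tags are all hypothetical and not taken from the paper.

```python
from collections import Counter

# Each sentence is a list of (word, pos_tag, chunk_tag) triples.
# This layout is an assumption for illustration, not the paper's corpus format.

def top_frequent_words(sentences, k):
    """Return the set of the k most frequent words in the corpus."""
    counts = Counter(word for sent in sentences for word, _pos, _chunk in sent)
    return {word for word, _count in counts.most_common(k)}

def rewrite_frequent_words(sentences, frequent):
    """Give each frequent word a word-specific POS tag (hypothetical scheme)."""
    rewritten = []
    for sent in sentences:
        rewritten.append([
            (word, f"{pos}_{word}" if word in frequent else pos, chunk)
            for word, pos, chunk in sent
        ])
    return rewritten

# Toy corpus with made-up tags, only to exercise the functions.
corpus = [
    [("我", "PN", "B-NP"), ("的", "DEG", "I-NP"), ("书", "NN", "I-NP")],
    [("他", "PN", "B-NP"), ("的", "DEG", "I-NP"), ("笔", "NN", "I-NP")],
]

frequent = top_frequent_words(corpus, k=1)
rewritten = rewrite_frequent_words(corpus, frequent)
```

In this sketch the rewritten triples would then be passed to a CRF trainer in place of the original ones. Keeping `k` small mirrors the abstract's observation that only a few frequent words are needed.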