In Chinese, frequent words generally carry little content that is useful for information extraction, yet they play an important role in the grammatical structure of sentences. The grammatical information these words carry is nevertheless usually ignored in Chinese segmentation. In this paper, we design an experiment for Chinese chunking that integrates the grammatical information of frequent words and investigates its effect on chunking. We use conditional random fields for chunking and rewrite the frequent words in the corpus so that they carry sentence structure information. The results show that the grammatical information of frequent words, even when the number of such words is very small, can significantly increase the accuracy of Chinese chunking.