This paper proposes an algorithm that chunks a given sentence into meaningful, coherent segments. The algorithm rests on the assumption that segment boundaries can be identified by analyzing information-theoretic measures of the part-of-speech (POS) n-grams within the sentence. A series of experiments on the POS-tagged corpus and Treebank from Academia Sinica supports this assumption. In an evaluation on 10,000 sentences, combining different classifiers based on these measures improves the system's coverage while maintaining its precision.
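As a rough illustration of the general idea, the sketch below places a segment boundary wherever the association between two adjacent POS tags is weak. The abstract does not name the measures or classifiers used, so this is not the paper's method. The sketch assumes pointwise mutual information (PMI) over POS bigrams as the measure and a single fixed threshold as the decision rule. The function names `train_pmi` and `chunk`, the toy tag set, and the threshold value are all hypothetical.

```python
import math
from collections import Counter


def train_pmi(tagged_sentences):
    """Estimate PMI for adjacent POS tag pairs from a list of tag sequences.

    Assumption: PMI stands in for the unspecified information-theoretic
    measures mentioned in the abstract.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tags in tagged_sentences:
        unigrams.update(tags)
        bigrams.update(zip(tags, tags[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    pmi = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        pmi[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return pmi


def chunk(tags, pmi, threshold=0.0, unseen=float("-inf")):
    """Split a tag sequence wherever adjacent-tag PMI falls below threshold.

    Tag pairs never seen in training get the `unseen` score, which by
    default always triggers a boundary (an illustrative choice).
    """
    if not tags:
        return []
    chunks = [[tags[0]]]
    for a, b in zip(tags, tags[1:]):
        if pmi.get((a, b), unseen) < threshold:
            chunks.append([b])      # weak association: start a new segment
        else:
            chunks[-1].append(b)    # strong association: extend the segment
    return chunks


# Toy corpus of POS tag sequences (hypothetical data, not from Academia Sinica).
corpus = [
    ["DET", "N", "V", "DET", "N"],
    ["DET", "N", "V"],
    ["N", "V", "DET", "N"],
]
pmi = train_pmi(corpus)
segments = chunk(["DET", "N", "V", "DET", "N"], pmi, threshold=1.5)
print(segments)
```

A single threshold on one measure is the simplest possible decision rule. The abstract's combination of several classifiers could be approximated by computing further measures in the same way and voting on each boundary, but how the paper actually combines them is not stated here.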