A decision tree approach to sentence chunking

Authors:
Samuel W. K. Chan
Affiliations:
Dept. of Decision Sciences, The Chinese University of Hong Kong, Hong Kong SAR
Venue:
AI'07 Proceedings of the 20th Australian joint conference on Advances in artificial intelligence
Year:
2007

Abstract

This paper proposes an algorithm that chunks a given sentence into meaningful and coherent segments. The algorithm rests on the assumption that segment boundaries can be identified by analyzing various information-theoretic measures of the part-of-speech (POS) n-grams within the sentence. The assumption is supported by a series of experiments using the POS-tagged corpus and Treebank from Academia Sinica. Experimental results on an evaluation set of 10,000 sentences show that combining different classifiers based on these measures improves the system's coverage while maintaining its precision.
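
To make the boundary assumption concrete, the following is a minimal sketch and not the paper's method. The abstract does not name the information-theoretic measures it uses, so pointwise mutual information (PMI) over POS bigrams is assumed here purely for illustration; the function names, the threshold, and the toy tag set are likewise hypothetical. The idea is that a low association score between two adjacent tags suggests a segment boundary between them.

```python
import math
from collections import Counter


def train_bigram_pmi(tag_sequences):
    """Estimate PMI for every adjacent POS tag pair seen in the training data.

    PMI is an assumed stand-in for the measures the abstract refers to.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tags in tag_sequences:
        unigrams.update(tags)
        bigrams.update(zip(tags, tags[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())

    pmi = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        pmi[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return pmi


def chunk(tags, pmi, threshold=0.0):
    """Split a tag sequence wherever adjacent tags score below the threshold.

    Tag pairs never seen in training are treated as boundaries
    (a simplifying assumption of this sketch).
    """
    if not tags:
        return []
    chunks = [[tags[0]]]
    for prev, cur in zip(tags, tags[1:]):
        score = pmi.get((prev, cur), float("-inf"))
        if score < threshold:
            chunks.append([cur])      # weak association: start a new segment
        else:
            chunks[-1].append(cur)    # strong association: extend the segment
    return chunks


# Toy usage with a hypothetical tag set.
corpus = [["DET", "NOUN"]] * 3 + [["VERB", "ADV"]] * 3
model = train_bigram_pmi(corpus)
segments = chunk(["DET", "NOUN", "VERB", "ADV"], model)
```

A single thresholded measure like this is only a starting point. The abstract reports that combining several classifiers built on different measures is what improves coverage without losing precision.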