The Chinese comma signals the boundary of discourse units and also anchors discourse relations between adjacent text spans. In this work, we propose a discourse-structure-oriented classification of the comma whose categories can be extracted automatically from the Chinese Treebank using syntactic patterns. We then experiment with two supervised learning methods that automatically disambiguate the Chinese comma according to this classification. The first method integrates comma classification into parsing; the second adopts a "post-processing" approach that extracts features from automatic parses to train a classifier. The experimental results show that the second approach compares favorably with the first.
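To make the "post-processing" idea more concrete, the following is a minimal sketch of how features might be read off a parse tree for each comma before being passed to a classifier. It is an illustration under stated assumptions, not the feature set or code used in the paper: the function names `parse_tree` and `comma_features`, the three features (parent label, left sibling label, right sibling label), and the example tree are all hypothetical. It assumes Penn-style bracketed trees in which commas are tagged `PU`, as in the Chinese Treebank.

```python
def parse_tree(s):
    """Parse a Penn-style bracketed string into (label, children) tuples.

    A preterminal is represented as (tag, [word]). Hypothetical helper,
    written only for this illustration.
    """
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a word under a POS tag
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    return node()


def comma_features(tree):
    """Yield one feature dict per comma found in the tree.

    The features here (parent and sibling labels) are assumed for the
    sake of the example; they are not taken from the paper.
    """
    label, children = tree
    for i, child in enumerate(children):
        if isinstance(child, str):
            continue  # a bare word has no subtree to inspect
        c_label, c_children = child
        if c_label == "PU" and c_children in ([","], ["，"]):
            left = children[i - 1][0] if i > 0 else "NONE"
            right = children[i + 1][0] if i + 1 < len(children) else "NONE"
            yield {"parent": label, "left": left, "right": right}
        else:
            yield from comma_features(child)


# Hypothetical parse: two clauses (IP) joined by a comma.
example = "(IP (IP (NP (PN wo)) (VP (VV lai))) (PU ,) (IP (NP (PN ta)) (VP (VV zou))))"
features = list(comma_features(parse_tree(example)))
```

Each resulting dictionary could then serve as one training instance for any standard supervised classifier, with the label coming from the treebank-derived comma categories described above.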