Using linguistically motivated features for paragraph boundary identification
EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
We propose and motivate a novel task: paragraph segmentation. We discuss this task and compare it with text segmentation and discourse parsing. We present a system that performs the task with high accuracy. A variety of features is proposed and examined in detail. The best models turn out to include lexical, coherence, and structural features.