Discourse indicators for content selection in summarization

Authors:
Annie Louis;Aravind Joshi;Ani Nenkova
Affiliations:
University of Pennsylvania, Philadelphia, PA;University of Pennsylvania, Philadelphia, PA;University of Pennsylvania, Philadelphia, PA
Venue:
SIGDIAL '10 Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Year:
2010

Abstract

We present analyses aimed at eliciting which specific aspects of discourse provide the strongest indication of text importance. In the context of content selection for single-document summarization of news, we examine the benefits of both the graph structure of text provided by discourse relations and the semantic sense of these relations. We find that structure information is the most robust indicator of importance. Semantic sense provides only constraints on content selection and is not, by itself, indicative of important content. However, sense features complement structure information and lead to improved performance. Further, both types of discourse information prove complementary to non-discourse features. While our results establish the usefulness of discourse features, we also find that lexical overlap provides a simple and cheap alternative to discourse for computing text structure, with comparable performance on the task of content selection.
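
To make the idea of lexical overlap as a source of text structure concrete, the following is a minimal sketch, not the authors' implementation. It assumes a bag-of-words cosine similarity between sentences, links two sentences when that similarity reaches a threshold, and ranks sentences by the summed weight of their links. The function names, the tokenizer, and the threshold value are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and keep only alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_lexical_overlap(sentences, threshold=0.1):
    """Return sentence indices ordered from most to least connected.

    Assumption: an edge exists between two sentences when their cosine
    similarity is at least `threshold` (an arbitrary illustrative value),
    and a sentence's score is the sum of its edge weights.
    """
    bags = [Counter(tokenize(s)) for s in sentences]
    scores = [0.0] * len(sentences)
    for i in range(len(bags)):
        for j in range(i + 1, len(bags)):
            sim = cosine(bags[i], bags[j])
            if sim >= threshold:
                scores[i] += sim
                scores[j] += sim
    # Higher score first; ties are broken by original sentence order.
    return sorted(range(len(sentences)), key=lambda i: (-scores[i], i))


if __name__ == "__main__":
    doc = [
        "The storm hit the coast.",
        "The storm damaged the coast towns.",
        "Stocks rose.",
        "The coast guard rescued towns.",
    ]
    for idx in rank_by_lexical_overlap(doc):
        print(idx, doc[idx])
```

In this sketch a sentence that shares vocabulary with many others ranks highest, while an isolated sentence ranks last. A discourse-based variant would replace the overlap edges with edges derived from annotated or parsed discourse relations.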