A comprehensive comparative evaluation of RST-based summarization methods

Authors:
Vinícius Rodrigues Uzêda;Thiago Alexandre Salgueiro Pardo;Maria Das Graças Volpe Nunes
Affiliations:
Universidade de São Paulo, Brazil;Universidade de São Paulo, Brazil;Universidade de São Paulo, Brazil
Venue:
ACM Transactions on Speech and Language Processing (TSLP)
Year:
2010

Abstract

Motivated by governmental, commercial, and academic interests, and driven by the growing amount of information available, mainly online, the field of automatic text summarization has seen a growing body of research and products, which has led to a large number of summarization methods. In this paper, we present a comprehensive comparative evaluation of the main automatic text summarization methods based on Rhetorical Structure Theory (RST), which are claimed to be among the best ones. We compare our results to superficial summarizers, which belong to a paradigm with severe limitations, and to hybrid methods that combine RST and superficial methods. We also test voting systems and machine learning techniques trained on RST features. We run experiments for English and Brazilian Portuguese and compare the results obtained using manually and automatically parsed texts. Our results systematically show that all RST methods have comparable overall performance and that they outperform most of the superficial methods. Machine learning techniques achieved high accuracy in classifying the text segments worth including in the summary, but were not able to produce more informative summaries than the regular RST methods.