A web-trained extraction summarization system

Authors:
Liang Zhou; Eduard Hovy
Affiliations:
USC Information Sciences Institute, Marina del Rey, CA; USC Information Sciences Institute, Marina del Rey, CA
Venue:
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Year:
2003

Abstract

A serious bottleneck in the development of trainable text summarization systems is the shortage of training data. Constructing such data is tedious, especially because a text can generally be summarized correctly in many different ways. Fortunately, the Internet can serve as a source of suitable training data. In this paper, we present a summarization system that is trained on data collected from the web. The procedure involves structuring articles downloaded from various websites, building adequate corpora of (summary, text) and (extract, text) pairs, training on positive and negative data, and automatically learning to perform extraction-based summarization at a level comparable to the best DUC systems.
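
The step of turning a (summary, text) pair into an (extract, text) pair with positive and negative training sentences can be illustrated with a minimal sketch. The abstract does not say how the alignment is done, so the bag-of-words cosine matching below is an assumption chosen for illustration, and the names `tokenize`, `cosine`, and `label_sentences` are hypothetical rather than taken from the paper.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words token lists."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = sqrt(sum(v * v for v in counts_a.values()))
    norm_b = sqrt(sum(v * v for v in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def label_sentences(summary_sentences, text_sentences):
    """Label each text sentence 1 (positive) or 0 (negative).

    Assumption: a text sentence is positive when it is the closest match,
    by cosine similarity, to at least one summary sentence. The positive
    sentences together form the extract for this text.
    """
    if not text_sentences:
        return []
    text_tokens = [tokenize(s) for s in text_sentences]
    positive = set()
    for summary_sentence in summary_sentences:
        summary_tokens = tokenize(summary_sentence)
        scores = [cosine(summary_tokens, t) for t in text_tokens]
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] > 0.0:  # ignore summary sentences with no overlap
            positive.add(best)
    return [1 if i in positive else 0 for i in range(len(text_sentences))]


# Usage with made-up sentences (not data from the paper).
text = [
    "The mayor opened a new bridge on Monday.",
    "Traffic was light in the morning.",
    "Officials said the bridge cost ten million dollars.",
]
summary = [
    "A new bridge was opened by the mayor.",
    "The bridge cost ten million dollars.",
]
labels = label_sentences(summary, text)
print(labels)
```

The resulting labels could then feed any sentence-level classifier. Which features and learner the authors actually used is not stated in the abstract, so the sketch stops at label construction.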