Topic-focused multi-document summarization using an approximate oracle score

Authors:
John M. Conroy;Judith D. Schlesinger;Dianne P. O'Leary
Affiliations:
IDA Center for Computing Sciences, Bowie, Maryland;IDA Center for Computing Sciences, Bowie, Maryland;University of Maryland, College Park, Maryland
Venue:
COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Year:
2006

Abstract

We consider the problem of producing a multi-document summary given a collection of documents. Because most successful methods of multi-document summarization are still largely extractive, we explore how well an extractive method can perform. We introduce an "oracle" score based on the probability distribution of unigrams in human summaries. We then demonstrate that, with the oracle score, we can generate extracts that score, on average, better than the human summaries when evaluated with ROUGE. In addition, we introduce an approximation to the oracle score that produces a system with the best known performance for the 2005 Document Understanding Conference (DUC) evaluation.
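
The abstract gives no formula for the oracle score, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that the probability of a unigram is estimated as the fraction of human summaries containing it, and that a candidate sentence is scored by the average probability of its distinct terms. The function names (`tokenize`, `unigram_probabilities`, `oracle_score`, `rank_sentences`) and the whitespace tokenizer are hypothetical choices made for illustration.

```python
from collections import Counter

PUNCTUATION = ".,;:!?\"'()"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation (a simplification)."""
    tokens = (word.strip(PUNCTUATION).lower() for word in text.split())
    return [token for token in tokens if token]


def unigram_probabilities(human_summaries):
    """Assumed estimate: P(t) is the fraction of human summaries that contain term t."""
    summary_count = len(human_summaries)
    containing = Counter()
    for summary in human_summaries:
        # Count each term at most once per summary.
        containing.update(set(tokenize(summary)))
    return {term: count / summary_count for term, count in containing.items()}


def oracle_score(sentence, probabilities):
    """Assumed score: mean probability over the distinct terms of the sentence."""
    terms = set(tokenize(sentence))
    if not terms:
        return 0.0
    return sum(probabilities.get(term, 0.0) for term in terms) / len(terms)


def rank_sentences(sentences, probabilities):
    """Order candidate sentences from highest to lowest assumed oracle score."""
    return sorted(sentences, key=lambda s: oracle_score(s, probabilities), reverse=True)
```

Under these assumptions, an extract would be built by taking sentences from the top of the ranking until a length limit is reached. A practical system would also need redundancy control and an approximation of the unigram probabilities that does not require human summaries, which the abstract mentions but does not describe.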