Using N-Grams to understand the nature of summaries

Authors:
Michele Banko; Lucy Vanderwende
Affiliations:
One Microsoft Way, Redmond, WA; One Microsoft Way, Redmond, WA
Venue:
HLT-NAACL-Short '04 Proceedings of HLT-NAACL 2004: Short Papers
Year:
2004

Abstract

Although single-document summarization is a well-studied task, the nature of multi-document summarization is only beginning to be studied in detail. While close attention has been paid to what technologies are necessary when moving from single- to multi-document summarization, the properties of human-written multi-document summaries have not been quantified. In this paper, we empirically characterize human-written summaries provided in a widely used summarization corpus by attempting to answer the questions: Can multi-document summaries that are written by humans be characterized as extractive or generative? Are multi-document summaries less extractive than single-document summaries? Our results suggest that extraction-based techniques that have been successful for single-document summarization may not be sufficient when summarizing multiple documents.
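The abstract does not spell out the measurement procedure, so the following is only a minimal sketch of one way an n-gram comparison could indicate how extractive a summary is. It assumes whitespace tokenization and lowercasing, and the function names `ngrams` and `extractive_fraction` are hypothetical, not taken from the paper. The idea is that a summary copied from its sources shares most of its longer n-grams with them, while a generative summary shares few.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extractive_fraction(summary, documents, n):
    """Fraction of the summary's n-grams that occur in at least one source.

    Assumption (not from the paper): tokens are lowercased and split on
    whitespace, and repeated n-grams in the summary are counted each time.
    """
    source = set()
    for doc in documents:
        source.update(ngrams(doc.lower().split(), n))

    summary_ngrams = ngrams(summary.lower().split(), n)
    if not summary_ngrams:
        return 0.0  # summary shorter than n tokens
    hits = sum(1 for gram in summary_ngrams if gram in source)
    return hits / len(summary_ngrams)


# Toy data invented for illustration only.
docs = [
    "the storm hit the coast on monday",
    "officials said the storm caused flooding",
]
summary = "the storm caused flooding on monday"

for n in (1, 2, 3):
    print(n, extractive_fraction(summary, docs, n))
```

In this toy case every word of the summary appears in the sources, but the share of matching n-grams falls as n grows, because the summary stitches together fragments from two documents. A steep drop of this kind is what one would expect from a summary that is not a straight extraction.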