Multi-document summarization by maximizing informative content-words

Authors:
Wen-tau Yih; Joshua Goodman; Lucy Vanderwende; Hisami Suzuki
Affiliations:
Microsoft Research, Redmond, WA (all authors)
Venue:
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Year:
2007

Abstract

We show that a simple procedure based on maximizing the number of informative content-words can produce some of the best reported results for multi-document summarization. We first assign a score to each term in the document cluster, using only frequency and position information. We then find the set of sentences in the document cluster that maximizes the sum of these scores, subject to length constraints. Our overall results are the best reported on the DUC-2004 summarization task for the ROUGE-1 score. On MSE-2005 our results are also the best, although not statistically significantly different from the best other system. Our system is also substantially simpler than the previous best system.
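
The two steps described in the abstract (score terms, then choose sentences under a length limit) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the function names `score_terms` and `select_sentences`, the formula that combines frequency with position, the choice to count each distinct term only once, and the greedy selection loop, which stands in for whatever search procedure the authors actually use.

```python
from collections import Counter


def score_terms(documents):
    """Score each term from its frequency and earliest relative position.

    documents: list of documents; each document is a list of sentences;
    each sentence is a list of lowercase content-word tokens.
    The weighting formula is an illustrative assumption, not the paper's.
    """
    freq = Counter()
    earliest = {}
    for doc in documents:
        total = sum(len(sentence) for sentence in doc) or 1
        offset = 0
        for sentence in doc:
            for token in sentence:
                freq[token] += 1
                rel = offset / total  # 0.0 at the start of the document
                earliest[token] = min(earliest.get(token, 1.0), rel)
                offset += 1
    n_tokens = sum(freq.values()) or 1
    # Frequent terms and terms that appear early get higher scores.
    return {t: (freq[t] / n_tokens) * (2.0 - earliest[t]) for t in freq}


def select_sentences(documents, scores, max_words):
    """Greedily add the sentence with the best new-term score per word.

    Greedy selection is a stand-in (assumption); it approximates, but does
    not guarantee, the maximum total score under the length limit.
    """
    candidates = [s for doc in documents for s in doc if s]
    chosen, covered, used = [], set(), 0
    while True:
        best, best_density = None, 0.0
        for sentence in candidates:
            if sentence in chosen or used + len(sentence) > max_words:
                continue
            # Each distinct term counts once across the summary (assumption).
            gain = sum(scores[t] for t in set(sentence) - covered)
            density = gain / len(sentence)
            if density > best_density:
                best, best_density = sentence, density
        if best is None:
            return chosen
        chosen.append(best)
        covered |= set(best)
        used += len(best)


docs = [
    [["storm", "hits", "coast"], ["officials", "report", "damage"]],
    [["storm", "damage", "reported"], ["coast", "storm", "warning"]],
]
term_scores = score_terms(docs)
summary = select_sentences(docs, term_scores, max_words=6)
print(summary)
```

Dividing the gain by sentence length keeps long sentences from winning only because they contain more words; an exact search over sentence subsets could replace the greedy loop without changing `score_terms`.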