Multi-document summarization via submodularity

Authors:
Jingxuan Li;Lei Li;Tao Li
Affiliations:
School of Computing and Information Sciences, Florida International University, Miami, USA 33199 (all three authors)
Venue:
Applied Intelligence
Year:
2012

Abstract

Multi-document summarization is becoming an important problem in the Information Retrieval community. It aims to distill the most important information from a set of documents into a compressed summary. Given a set of documents as input, most existing multi-document summarization approaches use sentence selection techniques to extract a set of sentences from the document set as the summary. The submodularity hidden in term coverage and textual-unit similarity motivates us to incorporate this property into our solution to multi-document summarization tasks. In this paper, we propose a new principled and versatile framework for different multi-document summarization tasks using submodular functions (Nemhauser et al. in Math. Prog. 14(1):265–294, 1978) based on term coverage and textual-unit similarity, which can be efficiently optimized with an improved greedy algorithm. We show that four known summarization tasks (generic, query-focused, update, and comparative summarization) can be modeled as variations derived from the proposed framework. Experiments on benchmark summarization data sets (e.g., the DUC04-06, TAC08, and TDT2 corpora) demonstrate the effectiveness of the proposed framework for general multi-document summarization tasks.
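
To make the idea of greedily optimizing a term-coverage objective concrete, here is a minimal sketch. It is not the authors' algorithm or objective: it assumes a plain cost-scaled greedy rule (pick the sentence with the best coverage gain per word), whitespace tokenization, unit term weights by default, and a word-count budget. The names `coverage` and `greedy_summary` are hypothetical. Term coverage is submodular because a term counts only once, so a sentence adds less to a summary that already covers some of its terms.

```python
from typing import Dict, List, Optional, Set


def coverage(selected: List[int], sentence_terms: List[Set[str]],
             term_weight: Dict[str, float]) -> float:
    """Weighted term coverage: every distinct covered term counts once."""
    covered: Set[str] = set()
    for i in selected:
        covered |= sentence_terms[i]
    # Assumption: terms without an explicit weight get weight 1.0.
    return sum(term_weight.get(t, 1.0) for t in covered)


def greedy_summary(sentences: List[str], budget: int,
                   term_weight: Optional[Dict[str, float]] = None) -> List[int]:
    """Return indices of selected sentences, in the order they were chosen."""
    weights = term_weight or {}
    # Assumption: lowercase whitespace tokens stand in for "terms".
    terms = [set(s.lower().split()) for s in sentences]
    # Assumption: the cost of a sentence is its word count.
    costs = [len(s.split()) for s in sentences]

    selected: List[int] = []
    remaining = budget
    current = 0.0
    while True:
        best, best_ratio = None, 0.0
        for i in range(len(sentences)):
            if i in selected or costs[i] == 0 or costs[i] > remaining:
                continue
            # Marginal gain of adding sentence i to the current summary.
            gain = coverage(selected + [i], terms, weights) - current
            ratio = gain / costs[i]
            if ratio > best_ratio:
                best, best_ratio = i, ratio
        if best is None:  # nothing fits, or nothing adds new terms
            break
        selected.append(best)
        remaining -= costs[best]
        current = coverage(selected, terms, weights)
    return selected


if __name__ == "__main__":
    docs = [
        "the storm hit the coast",
        "the storm hit the coast hard",
        "rescue teams arrived quickly",
    ]
    print(greedy_summary(docs, budget=9))
```

In the usage example, the near-duplicate second sentence contributes only one new term once the first is chosen, which is the diminishing-returns behavior the objective relies on. A textual-unit similarity objective or a query-dependent weighting, as mentioned in the abstract, would replace `coverage` while leaving the selection loop unchanged.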