The budgeted maximum coverage problem
Information Processing Letters
Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition
Centroid-based summarization of multiple documents
Information Processing and Management: an International Journal
Robust generic and query-based summarisation
EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 2
Manual and automatic evaluation of summaries
AS '02 Proceedings of the ACL-02 Workshop on Automatic Summarization - Volume 4
Bayesian query-focused summarization
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Cost-effective outbreak detection in networks
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Multi-document summarization via sentence-level semantic analysis and symmetric matrix factorization
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Multi-document summarization by sentence extraction
NAACL-ANLP-AutoSum '00 Proceedings of the 2000 NAACL-ANLP Workshop on Automatic Summarization
Topic-driven multi-document summarization with encyclopedic knowledge and spreading activation
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Exploring content models for multi-document summarization
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Manifold-ranking based topic-focused multi-document summarization
IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
Comparative document summarization via discriminative sentence selection
Proceedings of the 18th ACM conference on Information and knowledge management
Classifier subset selection for biomedical named entity recognition
Applied Intelligence
Multi-document summarization via budgeted maximization of submodular functions
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
MSSF: a multi-document summarization framework based on submodularity
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Ontology-enriched multi-document summarization in disaster management using submodular function
Information Sciences: an International Journal
Hi-index | 0.00 |
Multi-document summarization is becoming an important issue in the Information Retrieval community. It aims to distill the most important information from a set of documents to generate a compressed summary. Given a set of documents as input, most of existing multi-document summarization approaches utilize different sentence selection techniques to extract a set of sentences from the document set as the summary. The submodularity hidden in the term coverage and the textual-unit similarity motivates us to incorporate this property into our solution to multi-document summarization tasks. In this paper, we propose a new principled and versatile framework for different multi-document summarization tasks using submodular functions (Nemhauser et al. in Math. Prog. 14(1):265---294, 1978) based on the term coverage and the textual-unit similarity which can be efficiently optimized through the improved greedy algorithm. We show that four known summarization tasks, including generic, query-focused, update, and comparative summarization, can be modeled as different variations derived from the proposed framework. Experiments on benchmark summarization data sets (e.g., DUC04-06, TAC08, TDT2 corpora) are conducted to demonstrate the efficacy and effectiveness of our proposed framework for the general multi-document summarization tasks.