Multi-document summarization using A* search and discriminative training

Authors:
Ahmet Aker; Trevor Cohn; Robert Gaizauskas
Affiliations:
University of Sheffield, Sheffield, UK (all authors)
Venue:
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Year:
2010

Abstract

In this paper we address two key challenges for extractive multi-document summarization: the search problem of finding the best scoring summary and the training problem of learning the best model parameters. We propose an A* search algorithm to find the best extractive summary up to a given length, which is both optimal and efficient to run. Further, we propose a discriminative training algorithm which directly maximises the quality of the best summary, rather than assuming a sentence-level decomposition as in earlier work. Our approach leads to significantly better results than earlier techniques across a number of evaluation metrics.
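
To make the search problem concrete, the following is a minimal sketch of A* search for extractive summarization under a length limit. It is not the authors' implementation. It assumes that each sentence has a fixed non-negative score and a positive length, and that the summary score is the sum of its sentence scores. The heuristic used here (a fractional-knapsack upper bound on the score still attainable) is an illustrative choice and may differ from the heuristics used in the paper. The names `astar_summary` and `upper_bound` are hypothetical.

```python
import heapq
from typing import List, Sequence, Tuple


def upper_bound(items: Sequence[Tuple[int, float]], start: int, capacity: int) -> float:
    """Optimistic estimate of the score still attainable from items[start:].

    Assumes items are sorted by score/length in descending order, so filling
    the remaining capacity greedily (allowing one fractional sentence) can
    never underestimate the best achievable score. This makes it admissible.
    """
    bound = 0.0
    for length, score in items[start:]:
        if capacity <= 0:
            break
        if length <= capacity:
            bound += score
            capacity -= length
        else:
            bound += score * capacity / length  # fractional part of one sentence
            break
    return bound


def astar_summary(lengths: List[int], scores: List[float], budget: int) -> Tuple[List[int], float]:
    """Return (selected sentence indices, total score) for the best summary
    whose total length does not exceed `budget`."""
    # Sort by score density so the fractional bound is valid (assumed heuristic).
    order = sorted(range(len(lengths)), key=lambda i: scores[i] / lengths[i], reverse=True)
    items = [(lengths[i], scores[i]) for i in order]
    n = len(items)

    tie = 0  # tie-breaker so the heap never compares tuples of indices
    # Heap entry: (-(g + h), tie, g, next item index, remaining length, chosen positions)
    heap = [(-upper_bound(items, 0, budget), tie, 0.0, 0, budget, ())]
    while heap:
        _, _, g, i, remaining, chosen = heapq.heappop(heap)
        if i == n:
            # All sentences decided; with an admissible heuristic the first
            # complete state popped is optimal.
            return sorted(order[j] for j in chosen), g
        length, score = items[i]
        # Branch 1: skip sentence i.
        tie += 1
        h = upper_bound(items, i + 1, remaining)
        heapq.heappush(heap, (-(g + h), tie, g, i + 1, remaining, chosen))
        # Branch 2: include sentence i if it fits within the length limit.
        if length <= remaining:
            tie += 1
            g2 = g + score
            h2 = upper_bound(items, i + 1, remaining - length)
            heapq.heappush(heap, (-(g2 + h2), tie, g2, i + 1, remaining - length, chosen + (i,)))
    return [], 0.0


if __name__ == "__main__":
    # Toy data (made up for illustration): three sentences, length limit 50.
    print(astar_summary([10, 20, 30], [60.0, 100.0, 120.0], 50))
```

Because the bound never underestimates the remaining attainable score, the first fully decided state taken from the queue is an optimal summary under these assumptions. A tighter bound would typically expand fewer states.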