We learn a joint model of sentence extraction and compression for multi-document summarization. Our model scores candidate summaries according to a combined linear model whose features factor over (1) the n-gram types in the summary and (2) the compressions used. We train the model using a margin-based objective whose loss captures end summary quality. Because of the exponentially large set of candidate summaries, we use a cutting-plane algorithm to incrementally detect and add active constraints efficiently. Inference in our model can be cast as an ILP and thereby solved in reasonable time; we also present a fast approximation scheme which achieves similar performance. Our jointly extracted and compressed summaries outperform both unlearned baselines and our learned extraction-only system on both ROUGE and Pyramid, without a drop in judged linguistic quality. We achieve the highest published ROUGE results to date on the TAC 2008 data set.
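The training loop described above can be sketched in miniature. The sketch below is an illustrative stand-in, not the paper's implementation: it brute-forces a tiny candidate space instead of solving an ILP, uses a subgradient step in place of the cutting-plane QP, and the sentences, vocabulary, and toy loss are all invented for the example. What it does preserve is the core idea: repeatedly run loss-augmented inference to find the most violated candidate summary, and update the weights whenever the gold summary fails to beat it by the required margin.

```python
import itertools
import numpy as np

# Toy corpus: each "sentence" is a tuple of word types (invented for this sketch).
sentences = [("economy", "growth"), ("growth", "jobs"), ("sports", "score")]
VOCAB = ["economy", "growth", "jobs", "sports", "score"]
LENGTH_LIMIT = 2  # max sentences per candidate summary

def features(summary):
    # Features factor over word types present in the summary (a stand-in
    # for the paper's n-gram-type features).
    present = {w for i in summary for w in sentences[i]}
    return np.array([1.0 if w in present else 0.0 for w in VOCAB])

def candidates():
    # Brute-force enumeration; the paper instead searches this space via an ILP.
    for k in range(1, LENGTH_LIMIT + 1):
        yield from itertools.combinations(range(len(sentences)), k)

gold = (0, 1)  # assumed gold-standard extraction for the toy example

def loss(summary):
    # Toy loss: symmetric difference with the gold sentence set
    # (the paper's loss instead captures end summary quality).
    return len(set(summary) ^ set(gold))

w = np.zeros(len(VOCAB))
for epoch in range(10):
    # Loss-augmented inference: detect the most violated constraint.
    worst = max(candidates(), key=lambda s: w @ features(s) + loss(s))
    margin = w @ features(gold) - (w @ features(worst) + loss(worst))
    if margin < 0:
        # Subgradient step toward satisfying the violated margin constraint.
        w += 0.5 * (features(gold) - features(worst))

best = max(candidates(), key=lambda s: w @ features(s))
```

After a few updates the learned weights score the gold summary highest, so `best` recovers `gold`. In the full model, `candidates()` is exponentially large, which is why violated constraints must be detected incrementally and inference cast as an ILP (or the fast approximation) rather than enumerated.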