Summarization with a joint model for sentence extraction and compression

  • Authors:
  • André F. T. Martins; Noah A. Smith

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, PA and Instituto de Telecomunicações, Instituto Superior Técnico, Lisboa, Portugal; Carnegie Mellon University, Pittsburgh, PA

  • Venue:
  • ILP '09 Proceedings of the Workshop on Integer Linear Programming for Natural Language Processing
  • Year:
  • 2009

Abstract

Text summarization is one of the oldest problems in natural language processing. Popular approaches rely on extracting relevant sentences from the original documents. As a side effect, sentences that are too long but partly relevant are doomed to either not appear in the final summary, or prevent inclusion of other relevant sentences. Sentence compression is a recent framework that aims to select the shortest subsequence of words that yields an informative and grammatical sentence. This work proposes a one-step approach for document summarization that jointly performs sentence extraction and compression by solving an integer linear program. We report favorable experimental results on newswire data.
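
The joint idea in the abstract can be illustrated with a toy example. The sketch below is not the authors' model: it assumes, purely for illustration, that each sentence comes with a small set of precomputed candidate compressions and a made-up relevance score for each, and it picks at most one candidate per sentence under a word budget. That selection problem is a small 0/1 integer linear program; here it is solved by exhaustive enumeration rather than with an ILP solver. The function name `summarize`, the example sentences, and all scores are hypothetical.

```python
from itertools import product


def summarize(candidates, budget):
    """Pick at most one candidate per sentence, maximizing total score.

    candidates: one list per source sentence, each holding (text, score)
                options: the full sentence and its hypothetical compressions.
    budget:     maximum number of whitespace-separated words in the summary.
    Returns (best_score, chosen_texts).
    """
    best_score, best_texts = 0.0, []
    # Index -1 stands for "leave this sentence out of the summary".
    option_ranges = [range(-1, len(options)) for options in candidates]
    for choice in product(*option_ranges):
        picked = [candidates[i][j] for i, j in enumerate(choice) if j >= 0]
        length = sum(len(text.split()) for text, _ in picked)
        score = sum(s for _, s in picked)
        # Keep the highest-scoring selection that fits the budget.
        if length <= budget and score > best_score:
            best_score, best_texts = score, [text for text, _ in picked]
    return best_score, best_texts


# Illustrative data (assumed): the first sentence has a full and a compressed form.
candidates = [
    [
        ("The committee that met on Tuesday approved the budget", 3.0),
        ("The committee approved the budget", 2.5),
    ],
    [("Markets rallied after the vote", 2.0)],
]

score, summary = summarize(candidates, budget=10)
print(score, summary)
```

With a 10-word budget, extraction alone could keep only the 9-word first sentence (score 3.0), because adding the second sentence would exceed the limit. Allowing the compressed form lets both sentences fit, which is the effect the abstract describes.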