A new sentence compression dataset and its use in an abstractive generate-and-rank sentence compressor

  • Authors:
  • Dimitrios Galanis; Ion Androutsopoulos

  • Affiliations:
  • Athens University of Economics and Business, Greece; Athens University of Economics and Business, Greece, and Digital Curation Unit -- IMIS, Research Center "Athena", Greece

  • Venue:
  • UCNLG+EVAL '11 Proceedings of the UCNLG+Eval: Language Generation and Evaluation Workshop
  • Year:
  • 2011

Abstract

Sentence compression has attracted much interest in recent years, but most sentence compressors are extractive, i.e., they only delete words. There is a lack of appropriate datasets to train and evaluate abstractive sentence compressors, i.e., methods that apart from deleting words can also rephrase expressions. We present a new dataset that contains candidate extractive and abstractive compressions of source sentences. The candidate compressions are annotated with human judgements for grammaticality and meaning preservation. We discuss how the dataset was created, and how it can be used in generate-and-rank abstractive sentence compressors. We also report experimental results with a novel abstractive sentence compressor that uses the dataset.
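
The generate-and-rank idea mentioned in the abstract can be illustrated with a minimal sketch: a set of candidate compressions is produced for a source sentence, each candidate is scored, and the highest-scoring one is returned. Everything below is an assumption made for illustration and is not taken from the paper: the `Candidate` fields, the 1-5 judgement scale, the example sentences and scores, and the `toy_score` function, which simply averages the two judgements and subtracts the compression rate. A real ranker would be trained to predict such judgements from features of the candidate, since human scores are not available for unseen sentences.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    """A candidate compression with hypothetical human judgements (1-5 scale)."""
    text: str
    grammaticality: float
    meaning: float


def compression_rate(source: str, candidate: str) -> float:
    """Length of the candidate relative to the source, in characters."""
    return len(candidate) / len(source)


def toy_score(source: str, cand: Candidate) -> float:
    """Illustrative score only: mean judgement minus compression rate.

    Higher judgements raise the score; longer candidates lower it.
    """
    quality = (cand.grammaticality + cand.meaning) / 2.0
    return quality - compression_rate(source, cand.text)


def rank(source: str,
         candidates: List[Candidate],
         score: Callable[[str, Candidate], float]) -> List[Candidate]:
    """Return the candidates sorted from best to worst under `score`."""
    return sorted(candidates, key=lambda c: score(source, c), reverse=True)


# Invented example: one source sentence and three candidate compressions.
source = "He came to a decision after many hours of talks."
candidates = [
    Candidate("He came decision after talks.", grammaticality=2, meaning=3),
    Candidate("He decided after long talks.", grammaticality=5, meaning=4),
    Candidate("He came to a decision.", grammaticality=5, meaning=3),
]

ranked = rank(source, candidates, toy_score)
best = ranked[0]
print(best.text)
```

In this toy setting the abstractive candidate, which rephrases "came to a decision" as "decided", ranks first because it keeps high judgement scores while remaining short. Keeping the generation and ranking steps separate means the scoring function can be swapped for a learned model without touching the candidate generator.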