Learning for Text Summarization Using Labeled and Unlabeled Sentences

  • Authors: Massih-Reza Amini; Patrick Gallinari
  • Venue: ICANN '01 Proceedings of the International Conference on Artificial Neural Networks
  • Year: 2001

Abstract

We describe an original machine learning approach to automatic text summarization that works by extracting the most relevant sentences from a document. Because labeled corpora are difficult to collect for this task, we propose a semi-supervised method that uses a small set of labeled sentences together with a large set of unlabeled documents to improve the performance of summarization systems. We show that this method is an instance of the Classification EM algorithm in the case of Gaussian densities, and that it can also be used in a non-parametric setting. Finally, we provide an empirical evaluation on the Reuters news-wire corpus.
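
To make the semi-supervised idea concrete, here is a minimal sketch of a Classification EM (CEM) loop with Gaussian class densities. It is not the authors' implementation: the diagonal covariance, the variance floor, the function names (`fit_gaussian`, `log_density`, `classification_em`) and the numeric sentence features are all assumptions made for illustration. Labels use 1 for "summary sentence" and 0 for "other", which is likewise a convention chosen here.

```python
import math


def fit_gaussian(points):
    """Per-dimension mean and variance (diagonal Gaussian, an assumption)."""
    n = len(points)
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / n for d in range(dim)]
    # Floor the variance so a degenerate class cannot cause division by zero.
    var = [max(sum((p[d] - mean[d]) ** 2 for p in points) / n, 1e-6)
           for d in range(dim)]
    return mean, var


def log_density(x, mean, var):
    """Log of a diagonal Gaussian density evaluated at x."""
    return sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
               for xi, m, v in zip(x, mean, var))


def classification_em(labeled, labels, unlabeled, max_iter=50):
    """Hard-assignment EM: labeled sentences keep their labels throughout."""
    classes = sorted(set(labels))

    def estimate(points, assigned):
        # M-step: log prior, mean and variance for each class.
        params = {}
        for c in classes:
            members = [p for p, a in zip(points, assigned) if a == c]
            mean, var = fit_gaussian(members)
            params[c] = (math.log(len(members) / len(points)), mean, var)
        return params

    # Initialise the class models from the labeled sentences only.
    params = estimate(labeled, labels)
    assigned = None
    for _ in range(max_iter):
        # E-step and C-step combined: each unlabeled sentence goes to the
        # class with the highest log posterior (up to a shared constant).
        new_assigned = [
            max(classes,
                key=lambda c: params[c][0] + log_density(x, params[c][1], params[c][2]))
            for x in unlabeled
        ]
        if new_assigned == assigned:
            break  # the partition is stable, so the parameters are too
        assigned = new_assigned
        # M-step on labeled plus currently assigned unlabeled sentences.
        params = estimate(labeled + unlabeled, labels + assigned)
    return params, assigned
```

Calling `classification_em(labeled, labels, unlabeled)` with lists of feature vectors returns the fitted class parameters and the hard labels assigned to the unlabeled sentences. An extractive summary could then keep the sentences assigned to class 1, although how the paper ranks and selects sentences is not stated in this abstract.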