Supervised ranking in open-domain text summarization

  • Authors:
  • Tadashi Nomoto;Yuji Matsumoto

  • Affiliations:
  • National Institute of Japanese Literature, Tokyo, Japan;Nara Institute of Science and Technology, Nara, Japan

  • Venue:
  • ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

The paper proposes and empirically motivates an integration of supervised learning with unsupervised learning to deal with human biases in summarization. In particular, we explore the use of probabilistic decision tree within the clustering framework to account for the variation as well as regularity in human created summaries. The corpus of human created extracts is created from a newspaper corpus and used as a test set. We build probabilistic decision trees of different flavors and integrate each of them with the clustering framework. Experiments with the corpus demonstrate that the mixture of the two paradigms generally gives a significant boost in performance compared to cases where either of the two is considered alone.