Evaluation of features for sentence extraction on different types of corpora

  • Authors:
  • Chikashi Nobata;Satoshi Sekine;Hitoshi Isahara

  • Affiliations:
  • Communications Research Laboratory, Seika-cho, Soraku-gun, Kyoto, Japan;New York University, New York, NY;Communications Research Laboratory, Seika-cho, Soraku-gun, Kyoto, Japan

  • Venue:
  • MultiSumQA '03 Proceedings of the ACL 2003 workshop on Multilingual summarization and question answering - Volume 12
  • Year:
  • 2003

Abstract

We report evaluation results for our summarization system and analyze the resulting summarization data for three different types of corpora. To develop a robust summarization system, we have created a system based on sentence extraction and applied it to summarize Japanese and English newspaper articles, obtaining some of the top results at two evaluation workshops. We have also created sentence extraction data from Japanese lectures and evaluated our system on these data. In addition to the evaluation results, we analyze the relationships between key sentences and the features used in sentence extraction. We find that discrete combinations of features match the distributions of key sentences better than sequential combinations.