Automatic evaluation of summaries using N-gram co-occurrence statistics

  • Authors:
  • Chin-Yew Lin;Eduard Hovy

  • Affiliations:
  • University of Southern California, Marina del Rey, CA;University of Southern California, Marina del Rey, CA

  • Venue:
  • NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
  • Year:
  • 2003

Quantified Score

Hi-index 0.01

Visualization

Abstract

Following the recent adoption by the machine translation community of automatic evaluation using the BLEU/NIST scoring process, we conduct an in-depth study of a similar idea for evaluating summaries. The results show that automatic evaluation using unigram co-occurrences between summary pairs correlates surprising well with human evaluations, based on various statistical metrics; while direct application of the BLEU evaluation procedure does not always give good results.