Summary evaluation: together we stand NPowER-ed

  • Authors: George Giannakopoulos, Vangelis Karkaletsis

  • Affiliation: Institute of Informatics and Telecommunications, NCSR Demokritos, Aghia Paraskevi, Attiki, Greece (both authors)

  • Venue: CICLing'13, Proceedings of the 14th International Conference on Computational Linguistics and Intelligent Text Processing, Volume 2
  • Year: 2013

Abstract

Summary evaluation has been a distinct domain of research for several years. Human summary evaluation appears to be a high-level cognitive process and is, thus, difficult to reproduce. Even though several automatic evaluation methods correlate well with human evaluations at the system level, they fail to achieve equivalent results when judging individual summaries. In this work, we propose the NPowER evaluation method, which uses machine learning to combine a set of methods from the family of "n-gram graph"-based summary evaluation methods. First, we show that the combined, optimized use of the evaluation methods outperforms each individual one. Second, we compare the proposed method to a combination of ROUGE metrics. Third, based on the results of feature selection, we study and discuss what can make future evaluation measures better. We show that we can easily provide per-summary evaluations that far exceed the performance of existing evaluation systems, and we bring the different measures under a unified view.
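The core idea of learning an optimized combination of individual n-gram graph evaluation scores against human grades can be sketched as a regression problem. Below is a minimal sketch in Python, assuming we already have per-summary scores from three automatic metrics and matching human grades; the choice of a linear regressor, the feature layout, and the toy numbers are illustrative assumptions rather than the paper's exact configuration.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Rows: one summary each; columns: scores from individual automatic
    # evaluation methods (e.g., n-gram graph based scores in the spirit of
    # AutoSummENG or MeMoG). Toy values for illustration only.
    X = np.array([
        [0.42, 0.38, 0.51],
        [0.55, 0.49, 0.60],
        [0.30, 0.28, 0.35],
        [0.61, 0.58, 0.66],
        [0.48, 0.44, 0.53],
        [0.25, 0.22, 0.30],
    ])
    # Human grades assigned to the same summaries (toy values).
    y = np.array([3.1, 4.0, 2.2, 4.5, 3.6, 1.9])

    # Learn a combination of the automatic scores that best predicts the
    # human grade; the fitted weights indicate how much each individual
    # measure contributes to the combined evaluation.
    model = LinearRegression().fit(X, y)
    print("weights:", model.coef_, "intercept:", model.intercept_)

    # Score a previously unseen summary from its automatic metric values.
    new_summary = np.array([[0.50, 0.47, 0.55]])
    print("predicted human grade:", model.predict(new_summary)[0])

Feature selection over such a learned combination, for example inspecting or pruning the fitted weights, is one way to read the abstract's third contribution: identifying which individual measures matter most for future evaluation metrics.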