The Text Analysis Conference (TAC) ranks summarization systems by their average score over a collection of document sets. We investigate the statistical appropriateness of this score and propose an alternative that better distinguishes between human and machine summarization systems.
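As a minimal sketch of the ranking scheme described above (the system names and scores here are hypothetical placeholders, assuming one quality score, e.g. ROUGE or responsiveness, per system per document set):

    from statistics import mean

    # scores[system] = per-document-set scores (hypothetical values)
    scores = {
        "system_A": [0.41, 0.37, 0.44],
        "system_B": [0.39, 0.42, 0.40],
        "human_1":  [0.55, 0.49, 0.58],
    }

    # TAC-style ranking: average each system's score over all document sets,
    # then sort systems by that average, highest first.
    ranking = sorted(scores, key=lambda s: mean(scores[s]), reverse=True)

    for system in ranking:
        print(f"{system}: {mean(scores[system]):.3f}")

The concern the abstract raises is that a single average over document sets can mask per-set variation; an alternative statistic would weigh that variation when separating human from machine systems.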