The Text Analysis Conference (TAC) ranks summarization systems by their average score over a collection of document sets. We investigate the statistical appropriateness of this score and propose an alternative that better distinguishes between human and machine summarization systems.
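As a minimal sketch of the ranking scheme described above (the system names and scores here are hypothetical placeholders, assuming one quality score, e.g. ROUGE or responsiveness, per system per document set):

    from statistics import mean

    # scores[system] = per-document-set scores (hypothetical values)
    scores = {
        "system_A": [0.41, 0.37, 0.44],
        "system_B": [0.39, 0.42, 0.40],
        "human_1":  [0.55, 0.49, 0.58],
    }

    # TAC-style ranking: average each system's score over all document sets,
    # then sort systems by that average, highest first.
    ranking = sorted(scores, key=lambda s: mean(scores[s]), reverse=True)

    for system in ranking:
        print(f"{system}: {mean(scores[system]):.3f}")

The concern the abstract raises is that a single average over document sets can mask per-set variation; an alternative statistic would weigh that variation when separating human from machine systems.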