We present a series of experiments to demonstrate the validity of Relative Utility (RU) as a measure for evaluating extractive summarizers. RU is applicable in both single-document and multi-document summarization, is extendable to arbitrary compression rates with no extra annotation effort, and takes into account both random system performance and interjudge agreement. Our results using the JHU summary corpus indicate that RU is a reasonable and often superior alternative to several common evaluation metrics.
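The abstract describes RU only at a high level, so the following is a minimal sketch of how an RU score might be computed, assuming each judge assigns a per-sentence utility score (e.g. on a 0-10 scale): the raw score is the pooled utility of the sentences the system selected, relative to the best attainable extract of the same length, and a normalized variant rescales that against random performance and the interjudge ceiling. The function names (`relative_utility`, `normalized_ru`), the utility-matrix layout, and the demo numbers are illustrative assumptions, not taken from the paper.

```python
def relative_utility(selected, utilities, k):
    """Raw Relative Utility (sketch, not the paper's reference code).

    selected  -- indices of the sentences the summarizer extracted
    utilities -- utilities[j][i]: utility judge j assigned sentence i
    k         -- extract length in sentences (set by the compression rate)
    """
    n = len(utilities[0])
    # Pool the judges' utility scores for each sentence.
    pooled = [sum(judge[i] for judge in utilities) for i in range(n)]
    system_score = sum(pooled[i] for i in selected)
    # Upper bound: the k sentences with the highest pooled utility.
    best_score = sum(sorted(pooled, reverse=True)[:k])
    return system_score / best_score


def normalized_ru(system_ru, random_ru, interjudge_ru):
    """Hypothetical helper: rescale RU so 0 corresponds to random
    sentence selection and 1 to the interjudge (upper-bound) score."""
    return (system_ru - random_ru) / (interjudge_ru - random_ru)


# Illustrative data: two judges scoring a four-sentence document.
utilities = [[10, 4, 7, 1],
             [9, 6, 5, 2]]
print(relative_utility(selected=[0, 1], utilities=utilities, k=2))  # ~0.935
```

Under these assumptions, the random baseline for the normalization would be estimated by averaging `relative_utility` over many random k-sentence extracts, and the interjudge ceiling by scoring each judge's own top-k extract against the pooled utilities; this is one plausible reading of how RU accounts for random system performance and interjudge agreement, not a claim about the paper's exact procedure.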