QARLA: a framework for the evaluation of text summarization systems

Authors:
Enrique Amigó;Julio Gonzalo;Anselmo Peñas;Felisa Verdejo
Affiliations:
Universidad Nacional de Educación a Distancia, Madrid - Spain;Universidad Nacional de Educación a Distancia, Madrid - Spain;Universidad Nacional de Educación a Distancia, Madrid - Spain;Universidad Nacional de Educación a Distancia, Madrid - Spain
Venue:
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Year:
2005

References (5):

BLEU: a method for automatic evaluation of machine translation
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics

Automatic evaluation of summaries using N-gram co-occurrence statistics
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1

The potential and limitations of automatic sentence extraction for summarization
HLT-NAACL-DUC '03 Proceedings of the HLT-NAACL 03 on Text summarization workshop - Volume 5

An empirical study of information synthesis tasks
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics

ORANGE: a method for evaluating automatic evaluation metrics for machine translation
COLING '04 Proceedings of the 20th international conference on Computational Linguistics

Cited by (11):

MT evaluation: human-like vs. human acceptable
COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions

Automatic summarising: The state of the art
Information Processing and Management: an International Journal

Context-aware discriminative phrase selection for statistical machine translation
StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation

Linguistic features for automatic evaluation of heterogenous MT systems
StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation

On the robustness of syntactic and semantic features for automatic MT evaluation
StatMT '09 Proceedings of the Fourth Workshop on Statistical Machine Translation

The contribution of linguistic features to automatic machine translation evaluation
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1

Significance tests of automatic machine translation evaluation metrics
Machine Translation

All in strings: a powerful string-based automatic MT evaluation metric with multiple granularities
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters

Text summarisation in progress: a literature review
Artificial Intelligence Review

Corroborating text evaluation results with heterogeneous measures
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing

Evaluating entity summarization using a game-based ground truth
ISWC'12 Proceedings of the 11th international conference on The Semantic Web - Volume Part II

Abstract

This paper presents a probabilistic framework, QARLA, for the evaluation of text summarisation systems. The input of the framework is a set of manual (reference) summaries, a set of baseline (automatic) summaries and a set of similarity metrics between summaries. It provides i) a measure to evaluate the quality of any set of similarity metrics, ii) a measure to evaluate the quality of a summary using an optimal set of similarity metrics, and iii) a measure to evaluate whether the set of baseline summaries is reliable or may produce biased results. Compared to previous approaches, our framework is able to combine different metrics and evaluate the quality of a set of metrics without any a priori weighting of their relative importance. We provide quantitative evidence that the approach improves the automatic evaluation of text summarisation systems by combining several similarity metrics.
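
The abstract gives no formulas, so the following Python sketch is only an illustration of how several similarity metrics might be combined without weights. It assumes one possible rule: a candidate summary is credited for a triple of reference summaries only when every metric rates it at least as close to one reference as two other references are to each other. The function names (unigram_overlap, length_ratio, unanimous_score), the two toy metrics and the way triples are enumerated are all hypothetical choices made for this example, not definitions taken from the paper.

```python
from itertools import permutations


def unigram_overlap(a: str, b: str) -> float:
    """Jaccard overlap between the word sets of two summaries (toy metric)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 1.0


def length_ratio(a: str, b: str) -> float:
    """Shorter length divided by longer length, in words (toy metric)."""
    la, lb = len(a.split()), len(b.split())
    return min(la, lb) / max(la, lb) if max(la, lb) else 1.0


def unanimous_score(candidate, references, metrics):
    """Fraction of reference triples (m, m1, m2) for which EVERY metric says
    the candidate is at least as similar to m as m1 is to m2.

    Assumption of this sketch: m ranges over all references and (m1, m2)
    over all ordered pairs of distinct references. No metric weights are used;
    a triple counts only if all metrics agree.
    """
    if len(references) < 2:
        raise ValueError("need at least two reference summaries")
    pairs = list(permutations(references, 2))
    hits = 0
    total = 0
    for m in references:
        for m1, m2 in pairs:
            total += 1
            if all(sim(candidate, m) >= sim(m1, m2) for sim in metrics):
                hits += 1
    return hits / total


if __name__ == "__main__":
    refs = [
        "the cat sat on the mat",
        "a cat sat on a mat",
        "the cat is on the mat",
    ]
    candidate = "the cat sat on a mat"
    print(unanimous_score(candidate, refs, [unigram_overlap, length_ratio]))
```

Because a triple counts only when all metrics agree, adding a metric to the set can never raise a candidate's score under this rule; that is the sense in which the combination needs no relative weighting.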