An ideal summarization system should produce summaries with both high content coverage and high linguistic quality. Many state-of-the-art summarization systems focus on content coverage by extracting content-dense sentences from source articles; a current research focus is post-processing these sentences so that they read fluently as a whole. The current AESOP task encourages research on evaluating summaries along three dimensions: content, readability, and overall responsiveness. In this work, we adapt a machine translation metric to measure content coverage, apply an enhanced discourse coherence model to evaluate summary readability, and combine both in a trained regression model to predict overall responsiveness. The results show significantly improved performance over the metrics submitted to AESOP 2011.
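The combination step described above can be illustrated with a small sketch: a linear regression that maps a (content score, readability score) pair to a predicted overall responsiveness score. This is a minimal, hypothetical illustration using ordinary least squares via the normal equations; the feature values, training data, and model form here are placeholders, not the paper's actual metrics or data.

```python
def fit_linear(features, targets):
    """Fit y ~ w0 + w1*x1 + w2*x2 by ordinary least squares.

    Solves the normal equations (X^T X) w = X^T y with Gaussian
    elimination and partial pivoting. `features` is a list of
    (x1, x2) pairs; `targets` is a list of scores.
    """
    X = [[1.0, x1, x2] for x1, x2 in features]  # prepend intercept
    n = 3
    # Build A = X^T X and b = X^T y.
    A = [[sum(row[i] * row[j] for row in X) for j in range(n)]
         for i in range(n)]
    b = [sum(row[i] * y for row, y in zip(X, targets)) for i in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (b[r] - sum(A[r][c] * w[c] for c in range(r + 1, n))) / A[r][r]
    return w

def predict(w, content_score, readability_score):
    """Predicted overall responsiveness from the two metric scores."""
    return w[0] + w[1] * content_score + w[2] * readability_score

# Toy (hypothetical) training data: per-summary (content, readability)
# metric scores paired with human responsiveness judgments.
train_X = [(0.2, 0.3), (0.5, 0.4), (0.7, 0.8), (0.9, 0.6), (0.4, 0.9)]
train_y = [1.5, 2.5, 4.0, 4.2, 3.0]
weights = fit_linear(train_X, train_y)
score = predict(weights, 0.6, 0.7)
```

In practice one would train such a model on human responsiveness judgments from a previous evaluation cycle and apply it to new system summaries; any off-the-shelf regression learner could replace the hand-rolled solver here.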