Performance confidence estimation for automatic summarization

  • Authors:
  • Annie Louis; Ani Nenkova

  • Affiliations:
  • University of Pennsylvania; University of Pennsylvania

  • Venue:
  • EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
  • Year:
  • 2009

Abstract

We address the task of automatically predicting whether summarization system performance will be good or bad, based on features derived directly from either single- or multi-document inputs. Our labelled corpus for the task is composed of data from large-scale evaluations completed over the span of several years. The variation in data between years allows for a comprehensive analysis of feature robustness, but poses a challenge for building a combined corpus that can be used for training and testing. Still, we find that the problem can be mitigated by appropriately normalizing for differences within each year. We also examine different formulations of the classification task, which considerably influence performance. The best results are 84% prediction accuracy for single-document and 74% for multi-document summarization.
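The abstract mentions normalizing for differences within each year so that evaluation data from several years can be pooled into one training corpus. The paper does not specify the scheme here; the sketch below shows one plausible approach, z-scoring each value within its own year, using hypothetical `(year, score)` pairs as input.

```python
from statistics import mean, pstdev

def normalize_by_year(records):
    """Z-score each value within its own year so that data from
    different evaluation years can be pooled into one corpus.
    `records` is a list of (year, value) pairs; returns a new list of
    (year, normalized_value) pairs in the same order.
    Illustrative only -- not the paper's actual normalization."""
    by_year = {}
    for year, value in records:
        by_year.setdefault(year, []).append(value)
    # Per-year mean and (population) standard deviation.
    stats = {y: (mean(vs), pstdev(vs)) for y, vs in by_year.items()}
    out = []
    for year, value in records:
        mu, sigma = stats[year]
        out.append((year, (value - mu) / sigma if sigma else 0.0))
    return out
```

After this step, values from different years share a common scale (mean 0, unit variance within each year), so a single classifier can be trained and tested on the combined data.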