Automatic evaluation of linguistic quality in multi-document summarization

Authors:
Emily Pitler; Annie Louis; Ani Nenkova
Affiliations:
University of Pennsylvania, Philadelphia, PA; University of Pennsylvania, Philadelphia, PA; University of Pennsylvania, Philadelphia, PA
Venue:
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Year:
2010


Abstract

To date, few attempts have been made to develop and validate methods for automatic evaluation of linguistic quality in text summarization. We present the first systematic assessment of several diverse classes of metrics designed to capture various aspects of well-written text. We train and test linguistic quality models on consecutive years of NIST evaluation data in order to show the generality of results. For grammaticality, the best results come from a set of syntactic features. Focus, coherence and referential clarity are best evaluated by a class of features measuring local coherence on the basis of cosine similarity between sentences, coreference information, and summarization-specific features. Our best results are 90% accuracy for pairwise comparisons of competing systems over a test set of several inputs and 70% for ranking summaries of a specific input.
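
Two ideas named in the abstract, cosine similarity between sentences and accuracy on pairwise comparisons, can be illustrated with a minimal Python sketch. The abstract does not give the paper's feature definitions or evaluation protocol, so everything here is an assumption made for illustration: the regex tokenizer, raw word counts as the sentence vectors, comparing adjacent sentences only, and the helper names bag_of_words, cosine, adjacent_similarities and pairwise_accuracy.

```python
import math
import re
from collections import Counter
from itertools import combinations


def bag_of_words(sentence):
    """Lowercased word counts; a deliberately simple, assumed tokenizer."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def adjacent_similarities(sentences):
    """Cosine similarity for each pair of consecutive sentences."""
    vectors = [bag_of_words(s) for s in sentences]
    return [cosine(u, v) for u, v in zip(vectors, vectors[1:])]


def pairwise_accuracy(predicted, gold):
    """Fraction of item pairs with different gold scores whose relative
    order is reproduced by the predicted scores (illustrative definition)."""
    pairs = [(i, j) for i, j in combinations(range(len(gold)), 2)
             if gold[i] != gold[j]]
    if not pairs:
        return 0.0
    correct = sum(
        (predicted[i] - predicted[j]) * (gold[i] - gold[j]) > 0
        for i, j in pairs
    )
    return correct / len(pairs)


if __name__ == "__main__":
    summary = ["The cat sat.", "The cat ran.", "Dogs bark loudly."]
    print(adjacent_similarities(summary))
    print(pairwise_accuracy([0.5, 0.9, 0.2], [3, 2, 1]))
```

In this sketch, low similarity between consecutive sentences would be read as a possible topic shift; summary statistics of these values (for example the mean or minimum) could then serve as features for a learned quality model. Ties in the gold scores are skipped in the pairwise measure, which is one of several reasonable conventions.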