Models for sentence compression: a comparison across domains, training requirements and evaluation measures

Authors:
James Clarke; Mirella Lapata
Affiliations:
University of Edinburgh, Edinburgh, UK; University of Edinburgh, Edinburgh, UK
Venue:
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Year:
2006

Abstract

Sentence compression is the task of producing a summary at the sentence level. This paper focuses on three aspects of this task that have not received detailed treatment in the literature: training requirements, scalability, and automatic evaluation. We provide a novel comparison between a supervised constituent-based and a weakly supervised word-based compression algorithm and examine how these models port to different domains (written vs. spoken text). To achieve this, we created a human-authored compression corpus, and our study highlights potential problems with the automatically gathered compression corpora currently in use. Finally, we assess whether automatic evaluation measures can be used to determine compression quality.