Monolingual distributional similarity for text-to-text generation

Authors:
Juri Ganitkevitch;Benjamin Van Durme;Chris Callison-Burch
Affiliations:
Center for Language and Speech Processing Human Language Technology Center of Excellence Johns Hopkins University, Baltimore, MD;Center for Language and Speech Processing Human Language Technology Center of Excellence Johns Hopkins University, Baltimore, MD;Center for Language and Speech Processing Human Language Technology Center of Excellence Johns Hopkins University, Baltimore, MD
Venue:
SemEval '12 Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation
Year:
2012

Abstract

Previous work on paraphrase extraction and application has relied either on parallel datasets or on distributional similarity metrics over large text corpora. Our approach combines these two orthogonal sources of information and directly integrates them into our paraphrasing system's log-linear model. We compare different distributional similarity feature sets and show significant improvements in grammaticality and meaning retention on the example text-to-text generation task of sentence compression, achieving state-of-the-art quality.
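
As a rough illustration of how a distributional similarity score could enter a log-linear model as one more feature, consider the minimal sketch below. It is not the system described in the abstract. The context vectors, feature names, and weights are hypothetical values chosen for illustration, and cosine similarity over sparse context counts is assumed as the similarity measure.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty context vector carries no similarity evidence
    return dot / (norm_u * norm_v)

def loglinear_score(features, weights):
    """Unnormalized log-linear score: weighted sum of feature values."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

# Hypothetical context-count vectors for a phrase and a candidate paraphrase,
# as might be collected from a large monolingual corpus.
ctx_phrase = {"left=the": 3.0, "right=of": 2.0, "right=in": 1.0}
ctx_para = {"left=the": 2.0, "right=of": 2.0, "left=a": 1.0}

# Hypothetical feature set: one feature standing in for evidence from
# parallel data, one for monolingual distributional similarity.
features = {
    "neg_log_p_bilingual": 1.2,
    "dist_sim": cosine(ctx_phrase, ctx_para),
}

# Hypothetical weights; in practice these would be tuned on held-out data.
weights = {"neg_log_p_bilingual": -1.0, "dist_sim": 0.5}

score = loglinear_score(features, weights)
print(f"dist_sim={features['dist_sim']:.4f} score={score:.4f}")
```

Because the model is a weighted sum of features, the similarity signal can be added without changing how the existing features are computed. Only an additional weight has to be tuned.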