Reranking bilingually extracted paraphrases using monolingual distributional similarity

Authors:
Tsz Ping Chan;Chris Callison-Burch;Benjamin Van Durme
Affiliations:
Johns Hopkins University;Johns Hopkins University;Johns Hopkins University
Venue:
GEMS '11 Proceedings of the GEMS 2011 Workshop on GEometrical Models of Natural Language Semantics
Year:
2011

Abstract

This paper improves an existing bilingual paraphrase extraction technique by using monolingual distributional similarity to rerank candidate paraphrases. Raw monolingual data provides a complementary and orthogonal source of information that reduces the errors commonly observed in bilingual pivot-based methods. Our experiments reveal that monolingual scoring of bilingually extracted paraphrases correlates significantly more strongly with human judgments of grammaticality than do the probabilities assigned by the bilingual pivoting method. The results also show that monolingual distributional similarity can serve as a threshold for high-precision paraphrase selection.
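
The reranking and thresholding idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which context features or similarity measure the paper uses, so everything below is an assumption made for illustration: the helper names (`context_vector`, `cosine`, `rerank`), the one-word left/right context window, plain cosine similarity over raw counts, and the toy corpus are all hypothetical and are not the authors' implementation.

```python
import math
from collections import Counter


def context_vector(phrase, sentences, window=1):
    """Count the words found within `window` tokens to the left and right
    of every occurrence of `phrase` in whitespace-tokenized sentences."""
    target = phrase.split()
    n = len(target)
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == target:
                for word in tokens[max(0, i - window):i]:
                    counts["L:" + word] += 1
                for word in tokens[i + n:i + n + window]:
                    counts["R:" + word] += 1
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rerank(phrase, candidates, sentences, threshold=0.0):
    """Score each candidate paraphrase against `phrase` by monolingual
    context similarity, drop those below `threshold`, and sort the rest
    from most to least similar."""
    source = context_vector(phrase, sentences)
    scored = [(c, cosine(source, context_vector(c, sentences)))
              for c in candidates]
    kept = [(c, s) for c, s in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy monolingual corpus (hypothetical data, for illustration only).
corpus = [
    "the huge dog barked",
    "the large dog barked",
    "the huge house stood",
    "the large house stood",
    "dogs run in fields",
]

# Candidates standing in for the output of a bilingual pivoting step.
ranked = rerank("huge", ["in", "large"], corpus)
selected = rerank("huge", ["in", "large"], corpus, threshold=0.5)
```

Here `ranked` orders the candidates by similarity to the original phrase, while `selected` keeps only those at or above the threshold, which corresponds to the high-precision selection setting mentioned in the abstract. A realistic system would need far more monolingual data and richer context features than this toy example.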