Paraphrasing with bilingual parallel corpora

  • Authors:
  • Colin Bannard;Chris Callison-Burch

  • Affiliations:
  • University of Edinburgh, Edinburgh;University of Edinburgh, Edinburgh

  • Venue:
  • ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

Previous work has used monolingual parallel corpora to extract and generate paraphrases. We show that this task can be done using bilingual parallel corpora, a much more commonly available resource. Using alignment techniques from phrase-based statistical machine translation, we show how paraphrases in one language can be identified using a phrase in another language as a pivot. We define a paraphrase probability that allows paraphrases extracted from a bilingual parallel corpus to be ranked using translation probabilities, and show how it can be refined to take contextual information into account. We evaluate our paraphrase extraction and ranking methods using a set of manual word alignments, and contrast the quality with paraphrases extracted from automatic alignments.