Acquiring translation equivalences of multiword expressions by normalized correlation frequencies

  • Authors:
  • Ming-Hong Bai; Jia-Ming You; Keh-Jiann Chen; Jason S. Chang

  • Affiliations:
  • Academia Sinica, Taiwan and National Tsing-Hua University, Taiwan; Academia Sinica, Taiwan; Academia Sinica, Taiwan; National Tsing-Hua University, Taiwan

  • Venue:
  • EMNLP '09: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Volume 2
  • Year:
  • 2009

Abstract

In this paper, we present an algorithm for extracting translations of any given multiword expression from parallel corpora. Given a multiword expression to be translated, the method involves extracting a short list of target candidate words from parallel corpora based on scores of normalized frequency, generating possible translations and filtering out common subsequences, and selecting the top-n possible translations using the Dice coefficient. Experiments show that our approach outperforms the word alignment-based and other naive association-based methods. We also demonstrate that adopting the extracted translations can significantly improve the performance of the Moses machine translation system.
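The final selection step mentioned in the abstract, ranking candidate translations by the Dice coefficient, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`contains`, `dice`, `top_n`), the sentence-level co-occurrence counting, the contiguous-match assumption, and the toy corpus are all hypothetical and chosen only for illustration. The earlier steps (candidate word extraction by normalized frequency and common-subsequence filtering) are not reproduced here; the candidate list is simply given.

```python
from typing import List, Sequence, Tuple

SentencePair = Tuple[List[str], List[str]]
Phrase = Tuple[str, ...]


def contains(sentence: Sequence[str], phrase: Sequence[str]) -> bool:
    """Return True if `phrase` occurs in `sentence` as a contiguous token run.

    Contiguity is an assumption of this sketch, not a claim about the paper.
    """
    n = len(phrase)
    return any(
        tuple(sentence[i:i + n]) == tuple(phrase)
        for i in range(len(sentence) - n + 1)
    )


def dice(corpus: List[SentencePair], source: Phrase, target: Phrase) -> float:
    """Dice coefficient 2*f(s,t) / (f(s) + f(t)) over aligned sentence pairs."""
    f_src = sum(contains(s, source) for s, _ in corpus)
    f_tgt = sum(contains(t, target) for _, t in corpus)
    f_both = sum(
        contains(s, source) and contains(t, target) for s, t in corpus
    )
    total = f_src + f_tgt
    return 2 * f_both / total if total else 0.0


def top_n(
    corpus: List[SentencePair],
    source: Phrase,
    candidates: List[Phrase],
    n: int = 1,
) -> List[Tuple[float, Phrase]]:
    """Score every candidate and keep the n highest-scoring ones."""
    scored = [(dice(corpus, source, c), c) for c in candidates]
    return sorted(scored, key=lambda item: item[0], reverse=True)[:n]


# Invented toy data (accents omitted), purely for illustration.
corpus: List[SentencePair] = [
    (["he", "kicked", "the", "bucket"], ["il", "a", "casse", "sa", "pipe"]),
    (["she", "kicked", "the", "bucket"], ["elle", "a", "casse", "sa", "pipe"]),
    (["he", "filled", "the", "bucket"], ["il", "a", "rempli", "le", "seau"]),
]
source: Phrase = ("kicked", "the", "bucket")
candidates: List[Phrase] = [("casse", "sa", "pipe"), ("le", "seau"), ("a",)]

ranking = top_n(corpus, source, candidates, n=2)
print(ranking)
```

On this toy corpus the idiomatic translation scores 1.0, while the frequent function word `a` scores 0.8: it co-occurs with the source expression twice but also appears in a third sentence, and the Dice denominator penalizes that extra occurrence.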