Estimating translation probabilities from the web for structured queries on CLIR

  • Authors:
  • Xabier Saralegi; Maddalen Lopez de Lacalle

  • Affiliations:
  • Elhuyar Foundation, R&D, Usurbil, Spain; Elhuyar Foundation, R&D, Usurbil, Spain

  • Venue:
  • ECIR'2010 Proceedings of the 32nd European conference on Advances in Information Retrieval
  • Year:
  • 2010

Abstract

We present two methods for estimating replacement probabilities without using parallel corpora. The first method exploits the translation probabilities latent in Machine Readable Dictionaries (MRD). The second method is more robust: it uses context similarity-based techniques to estimate word translation probabilities, treating the Internet as a bilingual comparable corpus. Experiments show that using the replacement probabilities obtained with the proposed methods yields a statistically significant improvement in mean average precision (MAP) over non-weighted structured queries. The context similarity-based method yields the most significant improvement.
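The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each translation candidate is scored by the cosine similarity between its context vector and the source word's context vector (already projected into the target language), that the scores are normalised into probabilities, and that a weighted structured query sums the candidates' term frequencies scaled by those probabilities. The function names `cosine`, `translation_probs`, and `weighted_tf`, and the toy data, are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse context vectors (dicts)."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def translation_probs(source_ctx, candidate_ctxs):
    """Normalise context similarities into a probability distribution.

    Assumption: source_ctx is already expressed in target-language terms.
    """
    sims = {t: cosine(source_ctx, ctx) for t, ctx in candidate_ctxs.items()}
    total = sum(sims.values())
    if total == 0:
        # No evidence for any candidate: fall back to uniform weights,
        # which corresponds to a non-weighted structured query.
        return {t: 1.0 / len(sims) for t in sims}
    return {t: s / total for t, s in sims.items()}


def weighted_tf(doc_tokens, probs):
    """Term frequency of a source word in a target-language document,
    computed as the probability-weighted sum over its translations."""
    counts = Counter(doc_tokens)
    return sum(p * counts[t] for t, p in probs.items())


# Toy example: two target-language candidates for one source word.
source_ctx = {"money": 2.0, "loan": 1.0}
candidates = {
    "bank": {"money": 2.0, "loan": 1.0},
    "shore": {"river": 3.0, "money": 1.0},
}
probs = translation_probs(source_ctx, candidates)
tf = weighted_tf(["bank", "shore", "bank", "river"], probs)
```

In this toy case the candidate whose contexts match the source word's contexts receives most of the probability mass, so its occurrences contribute more to the document score than those of the less likely translation.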