Should we translate the documents or the queries in cross-language information retrieval?

  • Authors:
  • J. Scott McCarley

  • Affiliations:
  • IBM T.J. Watson Research Center, Yorktown Heights, NY

  • Venue:
  • ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
  • Year:
  • 1999

Quantified Score

Hi-index 0.00

Visualization

Abstract

Previous comparisons of document and query translation suffered difficulty due to differing quality of machine translation in these two opposite directions. We avoid this difficulty by training identical statistical translation models for both translation directions using the same training data. We investigate information retrieval between English and French, incorporating both translations directions into both document translation and query translation-based information retrieval, as well as into hybrid systems. We find that hybrids of document and query translation-based systems out-perform query translation systems, even human-quality query translation systems.