Extending query translation to cross-language query expansion with markov chain models

  • Authors:
  • Guihong Cao;Jianfeng Gao;Jian-Yun Nie;Jing Bai

  • Affiliations:
  • Université de Montréal, Montréal, PQ, Canada;Microsoft Research, Redmond, WA;Université de Montréal, Montréal, PQ, Canada;Université de Montréal, Montréal, PQ, Canada

  • Venue:
  • Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Dictionary-based approaches to query translation have been widely used in Cross-Language Information Retrieval (CLIR) experiments. However, translation has been not only limited by the coverage of the dictionary, but also affected by translation ambiguities. In this paper we propose a novel method of query translation that combines other types of term relation to complement the dictionary-based translation. This allows extending the literal query translation to related words, which produce a beneficial effect of query expansion in CLIR. In this paper, we model query translation by Markov Chains (MC), where query translation is viewed as a process of expanding query terms to their semantically similar terms in a different language. In MC, terms and their relationships are modeled as a directed graph, and query translation is performed as a random walk in the graph, which propagates probabilities to related terms. This framework allows us to incorporating different types of term relation, either between two languages or within the source or target languages. In addition, the iterative training process of MC allows us to attribute higher probabilities to the target terms more related to the original query, thus offers a solution to the translation ambiguity problem. We evaluated our method on three CLIR benchmark collections, and obtained significant improvements over traditional dictionary-based approaches.