Knowledge transfer for cross domain learning to rank

  • Authors:
  • Depin Chen;Yan Xiong;Jun Yan;Gui-Rong Xue;Gang Wang;Zheng Chen

  • Affiliations:
  • School of Computer Science and Technology, University of Science and Technology of China, Hefei, China 230027;School of Computer Science and Technology, University of Science and Technology of China, Hefei, China 230027;Microsoft Research Asia, Beijing, China 100190;Shanghai Jiao Tong University, Shanghai, China 200240;Microsoft Research Asia, Beijing, China 100190;Microsoft Research Asia, Beijing, China 100190

  • Venue:
  • Information Retrieval
  • Year:
  • 2010

Abstract

Learning to rank has recently attracted increasing attention from both academia and industry in the areas of machine learning and information retrieval. A number of algorithms have been proposed that rank documents for a user-given query using a human-labeled training dataset. A basic assumption behind general learning to rank algorithms is that the training and test data are drawn from the same data distribution. However, this assumption does not always hold in real-world applications. For example, it can be violated when the labeled training data become outdated or come from a domain different from that of the test data. Such situations give rise to a new problem, which we define as cross domain learning to rank. In this paper, we aim to improve the learning of a ranking model in the target domain by leveraging knowledge from outdated or out-of-domain data (both referred to as source domain data). We first give a formal definition of the cross domain learning to rank problem. We then propose two novel methods that conduct knowledge transfer at the feature level and the instance level, respectively. Both methods use Ranking SVM as the basic learner. In the experiments, we evaluate the two methods on benchmark datasets for document retrieval. The results show that the feature-level transfer method performs better, with steady improvements over baseline approaches across different datasets, while the performance of the instance-level transfer method varies with the dataset used.
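
The abstract names Ranking SVM as the base learner and mentions instance-level transfer, but it gives no formulas. The following is only a minimal sketch of the general idea, assuming a linear pairwise Ranking SVM trained by stochastic subgradient descent, with source-domain preference pairs given a smaller fixed weight than target-domain pairs. The fixed down-weighting, the toy data, and the names `make_pairs`, `train_ranking_svm`, `score`, and `source_weight` are all hypothetical illustrations; this is not the weighting scheme proposed in the paper.

```python
import random


def make_pairs(features, labels, query_ids):
    """Build preference pairs: for documents of the same query where one has a
    higher relevance label, emit the feature difference (better - worse)."""
    pairs = []
    n = len(labels)
    for i in range(n):
        for j in range(n):
            if query_ids[i] == query_ids[j] and labels[i] > labels[j]:
                pairs.append([a - b for a, b in zip(features[i], features[j])])
    return pairs


def train_ranking_svm(pairs, pair_weights, dim, reg=0.01, lr=0.1, epochs=200, seed=0):
    """Stochastic subgradient descent on the weighted pairwise hinge loss
    reg/2 * ||w||^2 + sum_k pair_weights[k] * max(0, 1 - w . pairs[k])."""
    rng = random.Random(seed)
    w = [0.0] * dim
    order = list(range(len(pairs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for k in order:
            d = pairs[k]
            margin = sum(wi * di for wi, di in zip(w, d))
            # L2 regularisation step (shrinks every weight slightly).
            w = [wi * (1.0 - lr * reg) for wi in w]
            # Hinge-loss step, scaled by the weight of this pair.
            if margin < 1.0:
                w = [wi + lr * pair_weights[k] * di for wi, di in zip(w, d)]
    return w


def score(w, x):
    """Linear ranking score of one document."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Toy data (hypothetical): one target-domain query and one source-domain query.
target_X = [[1.0, 0.0], [0.0, 1.0]]
target_y = [2, 0]
target_q = ["t1", "t1"]
source_X = [[0.9, 0.2], [0.1, 0.1]]
source_y = [1, 0]
source_q = ["s1", "s1"]

target_pairs = make_pairs(target_X, target_y, target_q)
source_pairs = make_pairs(source_X, source_y, source_q)

# Assumption: source pairs count less than target pairs (fixed weight).
source_weight = 0.3
pairs = target_pairs + source_pairs
pair_weights = [1.0] * len(target_pairs) + [source_weight] * len(source_pairs)

w = train_ranking_svm(pairs, pair_weights, dim=2)
print("learned weights:", w)
```

In this sketch the source data influence the model only through `pair_weights`, so setting `source_weight` to 0 reduces it to training on the target domain alone, and setting it to 1 pools both domains without distinction.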