Flexible sample selection strategies for transfer learning in ranking

Authors:
Kevin Duh;Akinori Fujino
Affiliations:
NTT Communication Science Laboratories, 2-4 Hikaridai, Keihanna Science City, Kyoto 619-0237, Japan;NTT Communication Science Laboratories, 2-4 Hikaridai, Keihanna Science City, Kyoto 619-0237, Japan
Venue:
Information Processing and Management: an International Journal
Year:
2012

Abstract

Ranking is a central component of information retrieval systems, and many machine learning methods for building rankers have been developed in recent years. An open problem is transfer learning, i.e., how labeled training data from one domain or market can be used to build rankers for another. We propose a flexible transfer learning strategy based on sample selection. A source domain training sample is selected if the functional relationship between its features and label does not deviate much from that of the target domain. This is achieved through a novel application of recent advances in density ratio estimation. The approach is flexible, scalable, and modular, and it allows many existing supervised rankers to be adapted to the transfer learning setting. Results on two datasets (Yahoo's Learning to Rank Challenge and Microsoft's LETOR data) show that the proposed method gives robust improvements.
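
The selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator: it forms a naive density ratio from two Gaussian kernel density estimates over joint (features, label) vectors, a simplification that may differ from the paper's formulation, and keeps the source samples whose estimated target-to-source ratio is at or above a threshold. The function names, the bandwidth, the threshold, and the toy data are all assumptions chosen for illustration.

```python
import math


def gaussian_kde(points, bandwidth):
    """Return a function giving an unnormalized Gaussian kernel density at z.

    The normalizing constant is omitted; it cancels in the ratio below
    because both estimates share the same bandwidth and dimension.
    """
    def density(z):
        total = 0.0
        for p in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(z, p))
            total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        return total / len(points)
    return density


def select_source_samples(source, target, bandwidth=1.0, threshold=0.5, eps=1e-12):
    """Keep source samples whose estimated ratio p_target(z) / p_source(z)
    is at least `threshold` (hypothetical selection rule).

    Each sample z is a tuple (feature_1, ..., feature_d, label).
    """
    p_target = gaussian_kde(target, bandwidth)
    p_source = gaussian_kde(source, bandwidth)
    selected = []
    for z in source:
        ratio = p_target(z) / (p_source(z) + eps)
        if ratio >= threshold:
            selected.append(z)
    return selected


# Toy data: in the target domain the label grows with the single feature.
target = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
# The first three source samples follow the same relationship;
# the last two pair a low feature with a high label and vice versa.
source = [(0.1, 0.0), (1.1, 1.0), (1.9, 2.0), (0.0, 2.0), (2.0, 0.0)]

kept = select_source_samples(source, target, bandwidth=0.5, threshold=0.5)
print(kept)
```

In this toy setting the two source samples whose feature-label pairing contradicts the target data receive a small ratio and are dropped. The retained samples could then be passed to any supervised ranker, which is the modularity the abstract refers to. A direct density ratio estimator would avoid estimating the two densities separately.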