Ranking function adaptation with boosting trees

Authors:
Keke Chen; Jing Bai; Zhaohui Zheng
Affiliations:
Wright State University; Microsoft; Yahoo! Labs
Venue:
ACM Transactions on Information Systems (TOIS)
Year:
2011

Abstract

Machine-learned ranking functions have proven successful in Web search engines. As demand grows for effective ranking functions in different search domains, a major bottleneck has emerged: insufficient labeled training data, which has significantly slowed the development and deployment of machine-learned ranking functions for new domains. There are two possible approaches to this problem: (1) combining labeled training data from similar domains with the small amount of labeled target-domain data, or (2) training on pairwise preference data extracted from user clickthrough logs for the target domain. In this article, we propose a new approach called tree-based ranking function adaptation (Trada) that uses these data sources effectively to train cross-domain ranking functions. Tree adaptation assumes that ranking functions are trained with the Stochastic Gradient Boosting Trees method, a gradient boosting method on regression trees. It takes such a ranking function from one domain and tunes its tree-based structure with a small amount of training data from the target domain. Its unique features include (1) automatic identification of the part of the model that needs adjustment for the new domain and (2) appropriate weighting of training examples, considering both local and global distributions. Building on a novel pairwise loss function that we developed for pairwise learning, we also extend the basic tree adaptation algorithm (Pairwise Trada) to use pairwise preference data from the target domain and further improve the effectiveness of adaptation. Experiments on real datasets show that tree adaptation can provide better-quality ranking functions for a new domain than other methods.
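
The abstract gives no formulas, so the following is only a minimal sketch of the general idea of tuning an existing tree with a small amount of target-domain data. It is not the Trada algorithm itself. It assumes a single depth-1 regression tree whose split is kept fixed and whose leaf responses are re-estimated as a count-weighted average of the source response and the target-domain mean. The names `Stump`, `blend`, `adapt_stump`, and the discount parameter `beta` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Stump:
    """A depth-1 regression tree: one split and two leaf responses."""
    feature: int        # index of the feature used for the split
    threshold: float    # examples with x[feature] <= threshold go left
    left_value: float   # response of the left leaf
    right_value: float  # response of the right leaf
    left_count: int     # source-domain examples that reached the left leaf
    right_count: int    # source-domain examples that reached the right leaf

    def predict(self, x: Sequence[float]) -> float:
        if x[self.feature] <= self.threshold:
            return self.left_value
        return self.right_value


def blend(src_value: float, src_count: int,
          tgt_values: List[float], beta: float) -> float:
    """Count-weighted average of a source response and the target mean.

    beta is an assumed knob that discounts the source-domain count, so a
    small beta lets a few target examples move the response further.
    """
    if not tgt_values:
        return src_value  # no target evidence in this leaf: keep the source
    w_src = beta * src_count
    w_tgt = len(tgt_values)
    tgt_mean = sum(tgt_values) / w_tgt
    return (w_src * src_value + w_tgt * tgt_mean) / (w_src + w_tgt)


def adapt_stump(stump: Stump, X: Sequence[Sequence[float]],
                y: Sequence[float], beta: float = 1.0) -> Stump:
    """Return a copy of `stump` with leaf responses tuned on target data."""
    left = [t for x, t in zip(X, y) if x[stump.feature] <= stump.threshold]
    right = [t for x, t in zip(X, y) if x[stump.feature] > stump.threshold]
    return Stump(
        feature=stump.feature,
        threshold=stump.threshold,
        left_value=blend(stump.left_value, stump.left_count, left, beta),
        right_value=blend(stump.right_value, stump.right_count, right, beta),
        left_count=stump.left_count,
        right_count=stump.right_count,
    )


# Toy usage: a source-domain stump and three target-domain examples.
source = Stump(feature=0, threshold=0.5, left_value=1.0, right_value=3.0,
               left_count=8, right_count=8)
X_target = [[0.2], [0.4], [0.9]]
y_target = [2.0, 2.0, 3.0]
adapted = adapt_stump(source, X_target, y_target, beta=0.25)
print(adapted.left_value, adapted.right_value)
```

In this toy run the left leaf moves from the source response toward the target mean, while the right leaf stays put because the target data agrees with the source there. A full adaptation of a boosted ensemble would apply some such adjustment to every tree and, as the abstract notes, also decide which parts of the model need adjusting, which this sketch does not attempt.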