IR evaluation methods for retrieving highly relevant documents
SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Bayesian Learning for Neural Networks
Optimizing search engines using clickthrough data
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Simple BM25 extension to multiple weighted fields
Proceedings of the thirteenth ACM international conference on Information and knowledge management
Gaussian Processes for Ordinal Regression
The Journal of Machine Learning Research
Learning to rank using gradient descent
ICML '05 Proceedings of the 22nd international conference on Machine learning
Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning)
A Unifying View of Sparse Approximate Gaussian Process Regression
The Journal of Machine Learning Research
On rank-based effectiveness measures and optimization
Information Retrieval
SoftRank: optimizing non-smooth rank metrics
WSDM '08 Proceedings of the 2008 International Conference on Web Search and Data Mining
Robust sparse rank learning for non-smooth ranking measures
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Learning Preferences with Hidden Common Cause Relations
ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
Learning to Rank for Information Retrieval
Foundations and Trends in Information Retrieval
Multi-relational learning with Gaussian processes
IJCAI '09 Proceedings of the 21st international joint conference on Artificial intelligence
A Boosting Approach for Learning to Rank Using SVD with Partially Labeled Data
AIRS '09 Proceedings of the 5th Asia Information Retrieval Symposium on Information Retrieval Technology
Gradient descent optimization of smoothed information retrieval metrics
Information Retrieval
Temporal query log profiling to improve web search ranking
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Learning to blend rankings: a monotonic transformation to blend rankings from heterogeneous domains
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Fast active exploration for link-based preference learning using Gaussian processes
ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part III
Ranking continuous probabilistic datasets
Proceedings of the VLDB Endowment
Learning to re-rank web search results with multiple pairwise features
Proceedings of the fourth ACM international conference on Web search and data mining
Multi-task learning to rank for web search
Pattern Recognition Letters
Leveraging Auxiliary Data for Learning to Rank
ACM Transactions on Intelligent Systems and Technology (TIST)
Pairwise cross-domain factor model for heterogeneous transfer ranking
Proceedings of the fifth ACM international conference on Web search and data mining
Forest reranking through subtree ranking
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
In this paper we address the problem of learning to rank for document retrieval using Thurstonian models based on sparse Gaussian processes. Thurstonian models represent each document for a given query as a probability distribution in a score space; these distributions over scores naturally give rise to distributions over document rankings. However, in general we do not have observed rankings with which to train the model; instead, each document in the training set is judged to have a particular relevance level: for example "Bad", "Fair", "Good", or "Excellent". The performance of the model is then evaluated using information retrieval (IR) metrics such as Normalised Discounted Cumulative Gain (NDCG). Recently, Taylor et al. presented a method called SoftRank which allows direct gradient optimisation of a smoothed version of NDCG using a Thurstonian model. In this approach, document scores are represented by the outputs of a neural network, and score distributions are created artificially by adding random noise to the scores. The SoftRank mechanism is a general one: it can be applied to different IR metrics and can make use of different underlying models. In this paper we extend the SoftRank framework to make use of the score uncertainties which are naturally provided by a Gaussian process (GP), a probabilistic non-linear regression model. We further develop the model using sparse Gaussian process techniques, which give improved performance and efficiency, and show competitive results against baseline methods when tested on the publicly available LETOR OHSUMED data set. We also explore how the available uncertainty information can be used in prediction and how it affects model performance.
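The core mechanism described above — Gaussian score distributions inducing rank distributions, which in turn give a smoothed NDCG — can be sketched compactly. The following is a minimal NumPy illustration of the SoftRank-style computation (following Taylor et al.), not the paper's actual implementation; the function name and inputs are illustrative, and each document's score is assumed Gaussian with a given mean and variance (in the paper these would come from the sparse GP posterior).

```python
import numpy as np
from math import erf, sqrt

def soft_ndcg(means, variances, relevances):
    """Smoothed NDCG from Gaussian score distributions (SoftRank-style sketch).

    Pairwise "i outranks j" probabilities follow from the Gaussian score
    differences; a rank-binomial recursion then yields each document's
    distribution over ranks, and NDCG is taken in expectation.
    """
    n = len(means)
    # pi[i, j] = P(score_i > score_j) for independent Gaussian scores
    pi = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                z = (means[i] - means[j]) / sqrt(variances[i] + variances[j])
                pi[i, j] = 0.5 * (1.0 + erf(z / sqrt(2.0)))
    # p[j, r] = P(document j ends up at rank r): add competitors one at a
    # time; each competitor that outranks j pushes j down one rank.
    p = np.zeros((n, n))
    for j in range(n):
        dist = np.array([1.0])              # j alone sits at rank 0
        for i in range(n):
            if i == j:
                continue
            new = np.zeros(len(dist) + 1)
            new[1:] += dist * pi[i, j]      # i beats j: rank shifts down
            new[:-1] += dist * (1.0 - pi[i, j])
            dist = new
        p[j] = dist
    gains = 2.0 ** np.asarray(relevances, dtype=float) - 1.0
    discounts = 1.0 / np.log2(np.arange(n) + 2.0)
    ideal = np.sum(np.sort(gains)[::-1] * discounts)
    # Expected DCG under the rank distributions, normalised by the ideal DCG
    return float(np.sum(gains[:, None] * p * discounts[None, :]) / ideal)
```

With near-zero variances the ranking is effectively deterministic and the value reduces to ordinary NDCG; larger variances blur the rank distributions, which is what makes the metric smooth (and hence differentiable) in the score means.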