Subset ranking using regression

  • Authors:
  • David Cossock;Tong Zhang

  • Affiliations:
  • Yahoo Inc., Santa Clara, CA;Yahoo Inc., New York City

  • Venue:
  • COLT'06 Proceedings of the 19th annual conference on Learning Theory
  • Year:
  • 2006

Quantified Score

Hi-index 0.02

Visualization

Abstract

We study the subset ranking problem, motivated by its important application in web-search. In this context, we consider the standard DCG criterion (discounted cumulated gain) that measures the quality of items near the top of the rank-list. Similar to error minimization for binary classification, the DCG criterion leads to a non-convex optimization problem that can be NP-hard. Therefore a computationally more tractable approach is needed. We present bounds that relate the approximate optimization of DCG to the approximate minimization of certain regression errors. These bounds justify the use of convex learning formulations for solving the subset ranking problem. The resulting estimation methods are not conventional, in that we focus on the estimation quality in the top-portion of the rank-list. We further investigate the generalization ability of these formulations. Under appropriate conditions, the consistency of the estimation schemes with respect to the DCG metric can be derived.