Probabilistic relevance ranking for collaborative filtering

  • Authors:
  • Jun Wang;Stephen Robertson;Arjen P. de Vries;Marcel J. Reinders

  • Affiliations:
  • University College London, Ipswich, UK IP5 3RE;Microsoft Research, Cambridge, UK;CWI, Amsterdam, The Netherlands;Delft University of Technology, Delft, The Netherlands

  • Venue:
  • Information Retrieval
  • Year:
  • 2008

Abstract

Collaborative filtering is concerned with making recommendations about items to users. Most formulations of the problem are designed specifically for predicting user ratings and assume that past explicit ratings are available. In practice, however, we may have only implicit evidence of user preference; furthermore, the task is better viewed as generating a top-N list of the items that the user is most likely to like. In this regard, we argue that collaborative filtering can be cast directly as a relevance ranking problem. Starting from the classic Probability Ranking Principle of information retrieval, we propose a probabilistic item ranking framework. Within this framework we derive two different ranking models, showing that, despite their common origin, different factorizations reflect two distinct ways of approaching item ranking. For model estimation, we limit our discussion to implicit user preference data and adopt an approximation method introduced in a classic text retrieval model (i.e. the Okapi BM25 formula) to effectively decouple frequency counts from presence/absence counts in the preference data. Furthermore, we extend the basic formula by using Bayesian inference to estimate the probability of relevance (and non-relevance), which largely alleviates the data sparsity problem. Beyond the theoretical contribution, our experiments on real data sets demonstrate that the proposed methods perform significantly better than other strong baselines.
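
As a rough illustration of what decoupling frequency counts from presence/absence can mean for implicit data, the sketch below applies the familiar BM25-style saturation c(k + 1)/(k + c) to a user's interaction counts before ranking unseen items. It is not the paper's ranking model: the function names `saturate` and `rank_top_n`, the item-to-item weights, and the value k = 1.2 are all assumptions made for the example.

```python
from typing import Dict, List


def saturate(count: float, k: float = 1.2) -> float:
    """BM25-style saturation of a raw count (k = 1.2 is an assumed value).

    Returns 0 for count 0, 1 for count 1, and approaches k + 1 as the
    count grows, so repeated interactions add less and less evidence.
    """
    return count * (k + 1.0) / (k + count)


def rank_top_n(
    user_counts: Dict[str, int],
    weights: Dict[str, Dict[str, float]],
    n: int = 10,
    k: float = 1.2,
) -> List[str]:
    """Rank items the user has not interacted with (hypothetical scoring).

    user_counts: item -> how often the user interacted with it (implicit data).
    weights: candidate item -> {seen item -> association weight}; these
             weights are illustrative inputs, not estimated as in the paper.
    """
    scores = {}
    for candidate, related in weights.items():
        if candidate in user_counts:
            continue  # only recommend unseen items
        scores[candidate] = sum(
            w * saturate(user_counts.get(seen, 0), k)
            for seen, w in related.items()
        )
    # Highest score first; ties broken by item name for determinism.
    ordered = sorted(scores, key=lambda item: (-scores[item], item))
    return ordered[:n]


# Toy data: item "a" was played 10 times, item "b" once.
user_counts = {"a": 10, "b": 1}
weights = {
    "c": {"a": 0.5, "b": 0.1},
    "d": {"a": 0.1, "b": 0.9},
}
top = rank_top_n(user_counts, weights, n=2)
```

With raw counts, item "c" would score higher because of the ten interactions with "a"; after saturation those ten interactions count for less than twice a single one, so the stronger association of "d" with "b" decides the order.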