Collaborative Filtering with Maximum Entropy

  • Authors: Dmitry Pavlov; Eren Manavoglu; David M. Pennock; C. Lee Giles

  • Affiliations: Yahoo; Pennsylvania State University; Yahoo Research Labs; Pennsylvania State University

  • Venue: IEEE Intelligent Systems
  • Year: 2004

Abstract

The authors describe a novel maximum-entropy (maxent) approach for generating online recommendations as a user navigates through a collection of documents. They show how to handle high-dimensional sparse data and represent it as a collection of ordered sequences of document requests. This representation and the maxent approach have several advantages: (1) you can naturally model long-term interactions and dependencies in the data sequences; (2) you can query the model quickly once it is learned, which makes the method applicable to high-volume Web servers; and (3) you obtain empirically high-quality recommendations. Although maxent learning is computationally infeasible if implemented in the straightforward way, the authors explored data clustering and several algorithmic techniques to make learning practical even in high dimensions. They present several methods for combining the predictions of maxent models learned in different clusters. They conducted offline tests using over six months' worth of data from ResearchIndex, a popular online repository of over 470,000 computer science documents. They show that their maxent algorithm is one of the most accurate recommenders, as compared to such techniques as correlation, a mixture of Markov models, a mixture of multinomial models, individual similarity-based recommenders currently available on ResearchIndex, and even various combinations of current ResearchIndex recommenders.
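
To make the idea of a maxent model over ordered sequences of document requests concrete, here is a minimal sketch of a generic conditional maximum-entropy (log-linear) next-document predictor. It is not the authors' implementation. The feature set (a per-document bias plus "document seen `lag` steps back" triggers), the function names `features`, `predict`, and `train`, the hyperparameters, and the toy sequences are all assumptions made for illustration. Training uses plain gradient ascent with an L2 penalty, chosen for brevity, and the sketch omits the clustering and model-combination steps the abstract mentions.

```python
import math


def features(history, candidate, max_lag=2):
    """Hypothetical features: a bias per candidate plus (lag, earlier doc, candidate) triggers."""
    feats = [("bias", candidate)]
    for lag in range(1, max_lag + 1):
        if lag <= len(history):
            feats.append((lag, history[-lag], candidate))
    return feats


def predict(weights, history, docs, max_lag=2):
    """Return P(next = d | history) for every d in docs under the log-linear model."""
    scores = {
        d: sum(weights.get(f, 0.0) for f in features(history, d, max_lag))
        for d in docs
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {d: math.exp(s - top) for d, s in scores.items()}
    z = sum(exps.values())
    return {d: v / z for d, v in exps.items()}


def train(sequences, docs, epochs=200, lr=0.5, l2=0.01, max_lag=2):
    """Fit weights by gradient ascent on the L2-penalized average log-likelihood."""
    weights = {}
    # One training example per position: (requests so far, next request).
    examples = [(seq[:i], seq[i]) for seq in sequences for i in range(1, len(seq))]
    for _ in range(epochs):
        grad = {}
        for history, target in examples:
            probs = predict(weights, history, docs, max_lag)
            # Observed feature counts minus counts expected under the model.
            for f in features(history, target, max_lag):
                grad[f] = grad.get(f, 0.0) + 1.0
            for d in docs:
                for f in features(history, d, max_lag):
                    grad[f] = grad.get(f, 0.0) - probs[d]
        for f, g in grad.items():
            w = weights.get(f, 0.0)
            weights[f] = w + lr * (g / len(examples) - l2 * w)
    return weights


# Toy sessions (made up): the request after "b" depends on the request two steps back.
sequences = [["a", "b", "c"], ["a", "b", "c"], ["d", "b", "e"], ["d", "b", "e"]]
docs = sorted({d for seq in sequences for d in seq})

weights = train(sequences, docs)
probs_after_ab = predict(weights, ["a", "b"], docs)
probs_after_db = predict(weights, ["d", "b"], docs)
print(max(probs_after_ab, key=probs_after_ab.get))
print(max(probs_after_db, key=probs_after_db.get))
```

In this toy setup the lag-2 trigger is what separates the two kinds of session, which illustrates in miniature how a sequence representation can capture dependencies beyond the immediately preceding request. Once the weights are learned, a query needs only one weighted sum per candidate document.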