Discriminative factored prior models for personalized content-based recommendation

  • Authors:
  • Lanbo Zhang;Yi Zhang

  • Affiliations:
  • University of California, Santa Cruz, Santa Cruz, CA, USA;University of California, Santa Cruz, Santa Cruz, CA, USA

  • Venue:
  • CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Most existing content-based filtering approaches including Rocchio, Language Models, SVM, Logistic Regression, Neural Networks, etc. learn user profiles independently without capturing the similarity among users. The Bayesian hierarchical models learn user profiles jointly and have the advantage of being able to borrow information from other users through a Bayesian prior. The standard Bayesian hierarchical model assumes all user profiles are generated from the same prior. However, considering the diversity of user interests, this assumption might not be optimal. Besides, most existing content-based filtering approaches implicitly assume that each user profile corresponds to exactly one user interest and fail to capture a user's multiple interests (information needs). In this paper, we present a flexible Bayesian hierarchical modeling approach to model both commonality and diversity among users as well as individual users' multiple interests. We propose two models each with different assumptions, and the proposed models are called Discriminative Factored Prior Models (DFPM). In our models, each user profile is modeled as a discriminative classifier with a factored model as its prior, and different factors contribute in different levels to each user profile. Compared with existing content-based filtering models, DFPM are interesting because they can 1) borrow discriminative criteria of other users while learning a particular user profile through the factored prior; 2) trade off well between diversity and commonality among users; and 3) handle the challenging classification situation where each class contains multiple concepts. The experimental results on a dataset collected from real users on digg.com show that our models significantly outperform the baseline models of L-2 regularized logistic regression and the standard Bayesian hierarchical model with logistic regression