Learning bidirectional asymmetric similarity for collaborative filtering via matrix factorization

  • Authors:
  • Bin Cao;Qiang Yang;Jian-Tao Sun;Zheng Chen

  • Affiliations:
  • The Hong Kong University of Science and Technology, Kowloon, Hong Kong;The Hong Kong University of Science and Technology, Kowloon, Hong Kong;Microsoft Research Asia, Beijing, China;Microsoft Research Asia, Beijing, China

  • Venue:
  • Data Mining and Knowledge Discovery
  • Year:
  • 2011

Abstract

Memory-based collaborative filtering (CF) aims at predicting the rating of a certain item for a particular user based on the previous ratings from similar users and/or similar items. Previous studies in finding similar users and items have several drawbacks. First, they are based on user-defined similarity measurements, such as the Pearson Correlation Coefficient (PCC) or Vector Space Similarity (VSS), which are, for the most part, neither adaptive nor optimized for specific applications and data. Second, these similarity measures are restricted to symmetric ones, such that the similarity between A and B is the same as that between B and A, although symmetry does not always hold in real-world applications. Third, they typically treat the similarity functions between users and the similarity functions between items separately, whereas in reality the similarities between users and between items are inter-related. In this paper, we propose a novel unified model for users and items, known as Similarity Learning based Collaborative Filtering (SLCF), based on a novel adaptive bidirectional asymmetric similarity measurement. Our proposed model automatically learns asymmetric similarities between users and items at the same time through matrix factorization. Theoretical analysis shows that our model is a novel generalization of singular value decomposition (SVD). Once the similarity relation is learned, it can be used flexibly for rating prediction, and we propose several strategies for making the best use of it: the similarity can be used either to improve memory-based approaches or directly in a model-based CF approach. In addition, we propose an online version of the rating prediction method to incorporate new users and new items.
We evaluate SLCF on three benchmark datasets (MovieLens, EachMovie, and Netflix) and show that our methods can outperform many state-of-the-art baselines.
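
To make the contrast in the abstract concrete, the following is a minimal sketch, not the SLCF model itself (the abstract does not give its equations). It shows a standard symmetric PCC similarity, a memory-based prediction that uses it, and one hypothetical way a factorization could yield an asymmetric similarity: giving each user separate "source" and "target" factor vectors. The toy ratings, the factor values, and the function names `pearson`, `low_rank_similarity`, and `predict` are all assumptions made for illustration.

```python
from math import sqrt

# Toy ratings (assumed data): user -> {item: rating}
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob":   {"m1": 4, "m2": 2, "m3": 3, "m4": 5},
    "carol": {"m1": 1, "m2": 5, "m4": 2},
}

def pearson(ratings_a, ratings_b):
    """PCC over co-rated items; returns 0.0 when it is undefined."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    cov = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    var_a = sum((ratings_a[i] - mean_a) ** 2 for i in common)
    var_b = sum((ratings_b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def low_rank_similarity(source_factors, target_factors, a, b):
    """Hypothetical asymmetric similarity (an assumption, not the paper's
    definition): inner product of a's source vector and b's target vector."""
    return sum(x * y for x, y in zip(source_factors[a], target_factors[b]))

def predict(user, item, ratings, similarity):
    """Memory-based prediction: the user's mean rating plus a
    similarity-weighted average of neighbours' mean-centred ratings."""
    user_mean = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        w = similarity(user, other)
        other_mean = sum(other_ratings.values()) / len(other_ratings)
        num += w * (other_ratings[item] - other_mean)
        den += abs(w)
    return user_mean if den == 0 else user_mean + num / den

# PCC is symmetric by construction.
pcc_ab = pearson(ratings["alice"], ratings["bob"])
pcc_ba = pearson(ratings["bob"], ratings["alice"])

# Assumed factor values; in a learned model these would be fitted to data.
source = {"alice": [1.0, 0.0], "bob": [0.5, 0.5]}
target = {"alice": [0.2, 0.8], "bob": [0.9, 0.1]}
sim_ab = low_rank_similarity(source, target, "alice", "bob")
sim_ba = low_rank_similarity(source, target, "bob", "alice")

# Any similarity function with this signature can be plugged into predict().
pred = predict("alice", "m4", ratings,
               lambda u, v: pearson(ratings[u], ratings[v]))
```

Because `predict` takes the similarity as a parameter, a learned similarity could replace PCC without changing the prediction rule, which is one way to read the abstract's remark that the learned similarity can improve memory-based approaches.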