Why Does Collaborative Filtering Work? Transaction-Based Recommendation Model Validation and Selection by Analyzing Bipartite Random Graphs

  • Authors:
  • Zan Huang; Daniel Dajun Zeng

  • Affiliations:
  • Department of Supply Chain and Information Systems, Pennsylvania State University, University Park, Pennsylvania 16802; Department of Management Information Systems, University of Arizona, Tucson, Arizona 85721, and Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

  • Venue:
  • INFORMS Journal on Computing
  • Year:
  • 2011

Abstract

A large number of collaborative filtering algorithms have been proposed in the literature as the foundation of automated recommender systems. However, the underlying justification for these algorithms is lacking, and their relative performance is typically domain and data dependent. In this paper, we aim to develop an initial understanding of the recommendation model/algorithm validation and selection issues based on the graph topological modeling methodology. By representing the input data, in the form of consumer-product interactions, as a bipartite graph (the consumer-product graph), we develop bipartite graph topological measures to capture patterns in the input data that are relevant to the transaction-based recommendation task. We observe the deviations of these topological measures of real-world consumer-product graphs from the expected values for simulated random bipartite graphs. These deviations help explain why certain collaborative filtering algorithms work for particular recommendation data sets. They can also serve as the basis for a comprehensive model selection framework that “recommends” appropriate collaborative filtering algorithms given the characteristics of the data set under study. We validate our approach using three real-world recommendation data sets and demonstrate the effectiveness of the proposed bipartite graph topological measures in the selection and validation of commonly used heuristic-based recommendation algorithms: the user-based, item-based, and graph-based algorithms.
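The comparison described in the abstract can be illustrated with a minimal Python sketch. The abstract does not define the paper's topological measures, so the measure here (the number of consumer pairs sharing at least one product) is an assumption chosen for illustration, as are all function names. The random baseline is also an assumption: a bipartite graph with the same node sets and edge count, with edges placed uniformly at random.

```python
import random
from itertools import combinations


def build_graph(transactions):
    """Map each consumer to the set of products they interacted with."""
    graph = {}
    for consumer, product in transactions:
        graph.setdefault(consumer, set()).add(product)
    return graph


def co_purchase_pairs(graph):
    """Illustrative measure (an assumption, not the paper's definition):
    the number of consumer pairs that share at least one product."""
    return sum(1 for a, b in combinations(sorted(graph), 2) if graph[a] & graph[b])


def random_bipartite(consumers, products, n_edges, rng):
    """Sample a bipartite graph with the same node sets and edge count,
    placing edges uniformly at random without repetition."""
    all_edges = [(c, p) for c in consumers for p in products]
    graph = {c: set() for c in consumers}
    for c, p in rng.sample(all_edges, n_edges):
        graph[c].add(p)
    return graph


def deviation_from_random(transactions, n_samples=200, seed=0):
    """Return (observed, expected) values of the measure, where the expected
    value is estimated by averaging over simulated random graphs."""
    rng = random.Random(seed)
    graph = build_graph(transactions)
    consumers = sorted(graph)
    products = sorted({p for ps in graph.values() for p in ps})
    n_edges = sum(len(ps) for ps in graph.values())
    observed = co_purchase_pairs(graph)
    simulated = [
        co_purchase_pairs(random_bipartite(consumers, products, n_edges, rng))
        for _ in range(n_samples)
    ]
    expected = sum(simulated) / n_samples
    return observed, expected


if __name__ == "__main__":
    # Hypothetical toy transactions: (consumer, product) pairs.
    toy = [("u1", "a"), ("u2", "a"), ("u3", "b"), ("u3", "c"), ("u1", "c")]
    observed, expected = deviation_from_random(toy)
    print("observed:", observed, "expected under random graphs:", round(expected, 2))
```

In this sketch, an observed value well above the simulated expectation would suggest more consumer overlap than chance alone produces, which is the kind of signal a neighborhood-based algorithm relies on. How such deviations map to specific algorithm choices is the subject of the paper and is not reproduced here.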