Analyzing Consumer-Product Graphs: Empirical Findings and Applications in Recommender Systems

  • Authors:
  • Zan Huang;Daniel D. Zeng;Hsinchun Chen

  • Affiliations:
  • Department of Supply Chain and Information Systems, Pennsylvania State University, 419 Business Building, University Park, Pennsylvania 16802;Department of Management Information Systems, The University of Arizona, McClelland Hall 430, 1130 East Helen Street, Tucson, Arizona 85721;Department of Management Information Systems, The University of Arizona, McClelland Hall 430, 1130 East Helen Street, Tucson, Arizona 85721

  • Venue:
  • Management Science
  • Year:
  • 2007

Quantified Score

Hi-index 0.02

Visualization

Abstract

We apply random graph modeling methodology to analyze bipartite consumer-product graphs that represent sales transactions to better understand consumer purchase behavior in e-commerce settings. Based on two real-world e-commerce data sets, we found that such graphs demonstrate topological features that deviate significantly from theoretical predictions based on standard random graph models. In particular, we observed consistently larger-than-expected average path lengths and a greater-than-expected tendency to cluster. Such deviations suggest that the consumers' product choices are not random even with the consumer and product attributes hidden. Our findings provide justification for a large family of collaborative filtering-based recommendation algorithms that make product recommendations based only on previous sales transactions. By analyzing the simulated consumer-product graphs generated by models that embed two representative recommendation algorithms, we found that these recommendation algorithm-induced graphs generally provided a better match with the real-world consumer-product graphs than purely random graphs. However, consistent deviations in topological features remained. These findings motivated the development of a new recommendation algorithm based on graph partitioning, which aims to achieve high clustering coefficients similar to those observed in the real-world e-commerce data sets. We show empirically that this algorithm significantly outperforms representative collaborative filtering algorithms in situations where the observed clustering coefficients of the consumer-product graphs are sufficiently larger than can be accounted for by these standard algorithms.