Mining Skewed and Sparse Transaction Data for Personalized Shopping Recommendation

  • Authors:
  • Chun-Nan Hsu; Hao-Hsiang Chung; Han-Shen Huang

  • Affiliations:
  • Institute of Information Science, Academia Sinica, Taiwan. chunnan@iis.sinica.edu.tw; Department of Computer Science and Information Engineering, National Taiwan University. r89057@csie.ntu.edu.tw; Department of Computer Science and Information Engineering, National Taiwan University. hanshen@iis.sinica.edu.tw

  • Venue:
  • Machine Learning
  • Year:
  • 2004

Abstract

A good shopping recommender system can boost sales in a retail store. To provide accurate recommendations, the recommender needs to accurately predict a customer's preferences, an ability that is difficult to acquire. Conventional data mining techniques, such as association rule mining and collaborative filtering, can generally be applied to this problem, but rarely produce satisfying results due to the skewness and sparsity of transaction data. In this paper, we report the lessons that we learned in two real-world data mining applications for personalized shopping recommendation. We learned that extending a rating-based collaborative filtering method (e.g., GroupLens) to perform personalized shopping recommendation is not trivial, and that association-rule-based methods (e.g., the IBM SmartPad system) are not appropriate for large-scale prediction of customers' shopping preferences. Instead, a probabilistic graphical model can be more effective in handling skewed and sparse data. By casting collaborative filtering algorithms in a probabilistic framework, we derived HyPAM (Hybrid Poisson Aspect Modelling), a novel probabilistic graphical model for personalized shopping recommendation. Experimental results show that HyPAM outperforms GroupLens and the IBM method by generating much more accurate predictions of what items a customer will actually purchase in the unseen test data. The data sets and the results are available for download at http://chunnan.iis.sinica.edu.tw/hypam/HyPAM.html.