Towards publishing recommendation data with predictive anonymization

  • Authors: Chih-Cheng Chang, Brian Thompson, Hui (Wendy) Wang, Danfeng Yao
  • Affiliations: Rutgers University, Piscataway, NJ (Chang, Thompson, Yao); Stevens Institute of Technology, Hoboken, NJ (Wang)
  • Venue: ASIACCS '10: Proceedings of the 5th ACM Symposium on Information, Computer and Communications Security
  • Year: 2010

Abstract

Recommender systems are used to predict user preferences for products or services. To seek better prediction techniques, data owners of recommender systems such as Netflix sometimes make their customers' reviews available to the public, which raises serious privacy concerns. With only a small amount of knowledge about individuals and their ratings of some items in a recommender system, an adversary may easily re-identify users and breach their privacy. Unfortunately, most existing privacy models (e.g., k-anonymity) cannot be directly applied to recommender systems. In this paper, we study the problem of privacy-preserving publishing of recommendation datasets. We represent recommendation data as a bipartite graph and identify several attacks that can re-identify users and determine their item ratings. To counter these attacks, we first give formal privacy definitions for recommendation data, and then develop a robust and efficient anonymization algorithm, Predictive Anonymization, to achieve our privacy goals. Our experimental results show that Predictive Anonymization prevents these attacks with very little impact on prediction accuracy.
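
To make the threat described above concrete, here is a minimal, hypothetical sketch (not taken from the paper) of the bipartite user-item view of a rating dataset and a simple structural re-identification check. All names, ratings, and the reidentify helper are illustrative assumptions, not the authors' algorithm.

    from collections import defaultdict

    # Hypothetical toy ratings: (user, item, rating) triples standing in for
    # a published recommendation dataset such as the Netflix Prize data.
    ratings = [
        ("u1", "Movie A", 5), ("u1", "Movie B", 3),
        ("u2", "Movie A", 4), ("u2", "Movie C", 2),
        ("u3", "Movie B", 1), ("u3", "Movie C", 5),
    ]

    # Bipartite-graph view: users on one side, items on the other,
    # edges labeled with ratings.
    user_edges = defaultdict(dict)   # user -> {item: rating}
    item_edges = defaultdict(dict)   # item -> {user: rating}
    for user, item, rating in ratings:
        user_edges[user][item] = rating
        item_edges[item][user] = rating

    def reidentify(background):
        """Re-identification attack sketch: an adversary who knows a few
        (item, rating) pairs for a target lists every user consistent
        with that background knowledge."""
        return [u for u, rated in user_edges.items()
                if all(rated.get(i) == r for i, r in background.items())]

    # Knowing just two of the target's ratings can already single out
    # one user, which is the kind of breach the paper aims to prevent.
    print(reidentify({"Movie A": 5, "Movie B": 3}))   # -> ['u1']

In this sketch, anonymization would have to perturb or group the edges so that any such background knowledge matches several users rather than exactly one, which is the intuition behind the privacy goals the paper formalizes.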