Satisfying privacy requirements: one step before anonymization

  • Authors:
  • Xiaoxun Sun; Hua Wang; Jiuyong Li

  • Affiliations:
  • Department of Mathematics & Computing, University of Southern Queensland, Australia; Department of Mathematics & Computing, University of Southern Queensland, Australia; School of Computer and Information Science, University of South Australia, Australia

  • Venue:
  • PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part I
  • Year:
  • 2010

Abstract

In this paper, we study the problem of privacy protection in large survey rating data. Rating data usually contains ratings of both sensitive and non-sensitive issues, and the ratings of sensitive issues include personal information. Even when survey participants do not reveal any of their ratings, their survey records are potentially identifiable using information from other public sources. We propose a new (k,ε,l)-anonymity model, in which each record is required to be similar to at least k−1 others based on the non-sensitive ratings, where the similarity is controlled by ε, and the standard deviation of the sensitive ratings is at least l. We study an interesting yet nontrivial satisfaction problem of (k,ε,l)-anonymity, which is to decide whether a survey rating data set satisfies the privacy requirements given by users. We develop a slicing technique for the satisfaction problem, and the experimental results show that it is fast, scalable and much more efficient in terms of execution time than the heuristic pairwise method.
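
To make the satisfaction problem concrete, the following is a minimal sketch of a naive pairwise check. It is not the slicing technique the paper develops, and it is not necessarily the baseline the paper compares against. The abstract does not fix several details, so the sketch assumes them: two records count as ε-similar when no non-sensitive rating differs by more than ε (maximum absolute difference), each record has a single sensitive rating, a record's group includes the record itself, and the population standard deviation is used. The function name `satisfies_k_eps_l` and the record layout are hypothetical.

```python
from statistics import pstdev


def satisfies_k_eps_l(records, k, eps, l):
    """Naive pairwise check of a (k, eps, l)-style requirement.

    records: list of (non_sensitive_ratings, sensitive_rating) pairs,
             where non_sensitive_ratings is a tuple of numbers.
    Assumptions (not taken from the abstract): similarity is the maximum
    absolute difference over non-sensitive ratings, each record belongs
    to its own group, and the population standard deviation is used.
    """
    for ratings_i, _ in records:
        # Sensitive ratings of every record within eps of record i
        # (record i itself always qualifies, with distance 0).
        group = [
            sens_j
            for ratings_j, sens_j in records
            if max((abs(a - b) for a, b in zip(ratings_i, ratings_j)),
                   default=0) <= eps
        ]
        # Record i needs at least k - 1 similar others, i.e. k with itself.
        if len(group) < k:
            return False
        # Sensitive ratings in the group must be spread out enough.
        if pstdev(group) < l:
            return False
    return True


# Hypothetical sample: two non-sensitive ratings and one sensitive rating.
sample = [
    ((1, 2), 1),
    ((1, 3), 5),
    ((2, 2), 3),
    ((9, 9), 4),  # isolated record with no eps-similar neighbours
]
```

Calling `satisfies_k_eps_l(sample, k=3, eps=1, l=1.5)` returns `False` because the last record has no similar neighbours, while the same call on `sample[:3]` returns `True`. The check compares every pair of records, so its cost grows quadratically with the number of records, which is the kind of cost a faster technique would aim to avoid.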