Searching for Better Randomized Response Schemes for Privacy-Preserving Data Mining

  • Authors:
  • Zhengli Huang; Wenliang Du; Zhouxuan Teng

  • Affiliations:
  • Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY 13244, U.S.A. (all three authors)

  • Venue:
  • PKDD 2007: Proceedings of the 11th European Conference on Principles and Practice of Knowledge Discovery in Databases
  • Year:
  • 2007

Abstract

To preserve user privacy in Privacy-Preserving Data Mining (PPDM), the randomized response (RR) technique is widely used for categorical data. Although various RR schemes have been proposed, no study has systematically compared them in order to find optimal RR schemes. In this paper, we use the R-U (Risk-Utility) confidentiality map to compare different randomization schemes. Using the R-U map as our metric, we present an optimal RR scheme for binary data, which helps us find an optimal class of RR matrices. From this optimal scheme, we derive several heuristic rules that hold among the elements of the optimal class. We then generalize these rules to categorical data and, based on them, propose an RR scheme that finds a class of RR matrices for categorical data. Our experimental results show that our scheme performs much better than the existing RR schemes.
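For readers unfamiliar with randomized response, the following is a minimal sketch of the basic idea for binary data. It assumes a symmetric Warner-style RR matrix [[p, 1-p], [1-p, p]] and is not the optimal scheme described in the abstract. The names `randomize`, `estimate_proportion`, and the parameter `p` are hypothetical and chosen only for illustration.

```python
import random


def randomize(value, p, rng):
    """Report the true bit with probability p, otherwise the flipped bit.

    Assumption: a symmetric binary RR matrix [[p, 1-p], [1-p, p]].
    """
    return value if rng.random() < p else 1 - value


def estimate_proportion(reports, p):
    """Estimate the true share of 1s from the randomized reports.

    The observed share satisfies
        observed = p * pi + (1 - p) * (1 - pi),
    so inverting the RR matrix gives
        pi = (observed - (1 - p)) / (2p - 1).
    """
    if p == 0.5:
        raise ValueError("p = 0.5 makes the RR matrix singular")
    observed = sum(reports) / len(reports)
    return (observed - (1 - p)) / (2 * p - 1)


# Illustrative data: 30% of the respondents hold the sensitive value 1.
rng = random.Random(0)
true_data = [1] * 3000 + [0] * 7000
reports = [randomize(v, 0.8, rng) for v in true_data]
estimate = estimate_proportion(reports, 0.8)
```

In this sketch, a value of `p` closer to 1 makes the estimate more accurate (higher utility) but reveals more about each individual answer (higher risk), while a value closer to 0.5 does the opposite. This is the kind of trade-off that a risk-utility comparison of RR matrices has to weigh.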