Optimal randomization for privacy preserving data mining

  • Authors:
  • Yu Zhu;Lei Liu

  • Affiliations:
  • Purdue University, W. Lafayette, IN;Purdue University, W. Lafayette, IN

  • Venue:
  • Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

Randomization is an economical and efficient approach for privacy preserving data mining (PPDM). In order to guarantee the performance of data mining and the protection of individual privacy, optimal randomization schemes need to be employed. This paper demonstrates the construction of optimal randomization schemes for privacy preserving density estimation. We propose a general framework for randomization using mixture models. The impact of randomization on data mining is quantified by performance degradation and mutual information loss, while privacy and privacy loss are quantified by interval-based metrics. Two different types of problems are defined to identify optimal randomization for PPDM. Illustrative examples and simulation results are reported.