IEEE Transactions on Knowledge and Data Engineering
Journal of Computational Methods in Sciences and Engineering - Computational and Mathematical Methods for Science and Engineering Conference 2002 - CMMSE-2002
Data perturbation with random noise signals has been shown to be useful for data hiding in privacy-preserving data mining. Perturbation methods based on additive randomization allow accurate estimation of the Probability Density Function (PDF) via the Expectation-Maximization (EM) algorithm, but it has been shown that noise-filtering techniques can reconstruct the original data in many cases, leading to security breaches. In this paper, we propose a generic PDF reconstruction algorithm that can be used with non-additive (as well as additive) randomization techniques for privacy-preserving data mining. This two-step reconstruction algorithm is based on Parzen-window reconstruction and quadratic programming over a convex set, namely the probability simplex. Our algorithm eliminates the usual need for the iterative EM algorithm and is generic for most randomization models. The simplicity of the two-step reconstruction, which requires no iteration, also makes it attractive for large datasets.
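The abstract does not spell out the two steps, so the following Python sketch is only one plausible reading of them, not the authors' implementation. It assumes additive Gaussian noise (the simplest case), a discretised grid for the original values, and a least-squares objective. All function names, grid ranges, and the bandwidth are hypothetical. The quadratic program is solved here by projected gradient descent, which is itself iterative and stands in for a dedicated QP solver.

```python
import numpy as np

def parzen_density(samples, grid, bandwidth):
    """Step 1: Gaussian Parzen-window density estimate of `samples` on `grid`."""
    diffs = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * diffs**2) / np.sqrt(2.0 * np.pi)
    return kernel.mean(axis=1) / bandwidth

def project_to_simplex(v):
    """Euclidean projection of v onto {p : p >= 0, sum(p) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    tau = css[rho] / (rho + 1)
    return np.maximum(v - tau, 0.0)

def reconstruct_pmf(A, q, n_steps=5000):
    """Step 2: minimise ||A p - q||^2 over the probability simplex.

    Projected gradient descent is an assumption made for this sketch;
    any convex QP solver could be substituted.
    """
    p = np.full(A.shape[1], 1.0 / A.shape[1])
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)  # 1 / Lipschitz constant
    for _ in range(n_steps):
        grad = 2.0 * A.T @ (A @ p - q)
        p = project_to_simplex(p - step * grad)
    return p

# Synthetic data (hypothetical): original values x, released values y = x + noise.
rng = np.random.default_rng(0)
noise_sigma = 1.0
x = rng.normal(2.0, 0.5, size=2000)
y = x + rng.normal(0.0, noise_sigma, size=x.size)

x_grid = np.linspace(0.0, 4.0, 41)    # candidate original values
y_grid = np.linspace(-3.0, 7.0, 101)  # points where the perturbed PDF is estimated

# Step 1: estimate the PDF of the perturbed data.
q = parzen_density(y, y_grid, bandwidth=0.3)

# Randomization model: A[i, j] = f_noise(y_i - x_j), so (A @ p)[i] approximates f_Y(y_i).
A = np.exp(-0.5 * ((y_grid[:, None] - x_grid[None, :]) / noise_sigma) ** 2)
A /= noise_sigma * np.sqrt(2.0 * np.pi)

# Step 2: recover the probability mass of the original data on x_grid.
p = reconstruct_pmf(A, q)
reconstructed_mean = float(p @ x_grid)
```

For a non-additive randomization model, only the construction of `A` would change: each column would hold the conditional density of the released value given one candidate original value.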