We examine Euclidean distance-preserving data perturbation as a tool for privacy-preserving data mining. Such perturbations allow many important data mining algorithms (e.g., hierarchical and k-means clustering) to be applied, with only minor modification, to the perturbed data and produce exactly the same results as if applied to the original data. However, how well the privacy of the original data is preserved requires careful study. We undertake this study by assuming the role of an attacker armed with a small set of known original data tuples (inputs). Little prior work has examined this kind of attack when the number of known original tuples is smaller than the number of data dimensions. We focus on this important case, developing and rigorously analyzing an attack that utilizes any number of known original tuples. The approach allows the attacker to estimate the original data tuple associated with each perturbed tuple and to calculate the probability that the estimate constitutes a privacy breach. On a real 16-dimensional dataset, we show that an attacker with four known original tuples can estimate an unknown original tuple with less than 7% error with probability exceeding 0.8.
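A common way to realize a Euclidean distance-preserving perturbation is multiplication by a random orthogonal matrix. The hedged NumPy sketch below (illustrative only; the matrix dimensions and data are made up, not taken from the paper) shows that pairwise distances survive the transform, and that an attacker who knows at least d linearly independent input/output pairs can recover the map exactly via the orthogonal Procrustes solution. This is why the case of fewer than d known tuples, studied in the abstract above, is the interesting one.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_orthogonal(d, rng):
    # QR decomposition of a Gaussian matrix yields an orthogonal matrix;
    # sign-correcting by R's diagonal makes the distribution Haar-uniform.
    q, r = np.linalg.qr(rng.normal(size=(d, d)))
    return q * np.sign(np.diag(r))

d = 16                               # dimensionality, matching the abstract's example
X = rng.normal(size=(100, d))        # stand-in "original" data: 100 tuples
Q = random_orthogonal(d, rng)
Y = X @ Q                            # perturbed (rotated) data released to the miner

def pdist(A):
    # all pairwise Euclidean distances
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

# Distances are preserved exactly (up to float rounding), so distance-based
# mining on Y gives the same results as on X.
assert np.allclose(pdist(X), pdist(Y))

# Attacker's easy case: with k >= d known (input, output) pairs of full rank,
# the orthogonal Procrustes solution recovers Q, hence every original tuple.
k = d
Xk, Yk = X[:k], Y[:k]
U, _, Vt = np.linalg.svd(Xk.T @ Yk)
Q_hat = U @ Vt
assert np.allclose(Y @ Q_hat.T, X)   # full reconstruction of the original data
```

With fewer than d known pairs the linear system is underdetermined and Q cannot be pinned down exactly, which is the regime the paper's probabilistic attack addresses.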