Breaching Euclidean distance-preserving data perturbation using few known inputs

Authors:
Chris R. Giannella;Kun Liu;Hillol Kargupta
Affiliations:
The MITRE Corporation, 300 Sentinel Dr. Suite 600, Annapolis Junction MD 20701, United States;LinkedIn, 2029 Stierlin Court, Mountain View, CA 94043, United States;Dept. of CSEE, University of Maryland Baltimore County, Baltimore, MD 21250, United States and AGNIK LLC, Columbia, MD, United States
Venue:
Data & Knowledge Engineering
Year:
2013



Abstract

We examine Euclidean distance-preserving data perturbation as a tool for privacy-preserving data mining. Such perturbations allow many important data mining algorithms (e.g., hierarchical and k-means clustering) to be applied, with only minor modification, to the perturbed data and to produce exactly the same results as on the original data. How well the privacy of the original data is preserved, however, needs careful study. We undertake this study by assuming the role of an attacker armed with a small set of known original data tuples (inputs). Little work has examined this kind of attack when the number of known original tuples is smaller than the number of data dimensions. We focus on this important case, and we develop and rigorously analyze an attack that can use any number of known original tuples. The attack allows the attacker to estimate the original data tuple associated with each perturbed tuple and to calculate the probability that the estimate results in a privacy breach. On a real 16-dimensional dataset, we show that an attacker with 4 known original tuples can estimate an unknown original tuple with less than 7% error with probability exceeding 0.8.
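
As an illustration of the distance-preserving property the abstract relies on, the following is a minimal sketch, not the authors' perturbation scheme or their attack. It assumes the perturbation is multiplication by a random orthogonal matrix (one common way to preserve Euclidean distances) and checks that pairwise distances are unchanged up to rounding. The function names (random_orthogonal, perturb, max_distance_change), the synthetic data, and the dimension and sample sizes are all hypothetical choices made for this example.

```python
import math
import random


def random_orthogonal(n, rng):
    """Build an n x n orthogonal matrix by Gram-Schmidt on random Gaussian rows."""
    rows = []
    while len(rows) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Remove the components along the rows accepted so far.
        for r in rows:
            dot = sum(a * b for a, b in zip(v, r))
            v = [a - dot * b for a, b in zip(v, r)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-8:  # skip (very unlikely) degenerate draws
            rows.append([a / norm for a in v])
    return rows


def perturb(matrix, x):
    """Apply the assumed perturbation y = M x to one data tuple."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in matrix]


def distance(u, v):
    """Euclidean distance between two tuples of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def max_distance_change(original, perturbed):
    """Largest absolute change in any pairwise distance after perturbation."""
    worst = 0.0
    for i in range(len(original)):
        for j in range(i + 1, len(original)):
            d0 = distance(original[i], original[j])
            d1 = distance(perturbed[i], perturbed[j])
            worst = max(worst, abs(d0 - d1))
    return worst


# Synthetic data only; the sizes are illustrative assumptions.
rng = random.Random(0)
dim, count = 16, 20
data = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(count)]
M = random_orthogonal(dim, rng)
released = [perturb(M, x) for x in data]

# Expected to be near floating-point rounding error.
print(max_distance_change(data, released))
```

Because every pairwise distance survives the transformation, a distance-based algorithm such as k-means sees the same geometry in the released data as in the original. For a purely orthogonal map, the distances between an unknown tuple and any known tuples are preserved as well, which is the kind of leverage a known-input attacker can exploit.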