Analysis of privacy preserving random perturbation techniques: further explorations

Authors:
Haimonti Dutta;Hillol Kargupta;Souptik Datta;Krishnamoorthy Sivakumar
Affiliations:
University of Maryland, Baltimore County, Baltimore, Maryland;University of Maryland, Baltimore County, Baltimore, Maryland;University of Maryland, Baltimore County, Baltimore, Maryland;Washington State University, Pullman, Washington
Venue:
Proceedings of the 2003 ACM workshop on Privacy in the electronic society
Year:
2003



Abstract

Privacy is becoming an increasingly important issue in many data mining applications, particularly in the security and defense area. This has triggered the development of many privacy-preserving data mining techniques. A large fraction of them use randomized data distortion to mask the data: they attempt to hide sensitive data by randomly modifying the data values with additive noise. This paper questions the utility of such randomized data distortion techniques for preserving privacy in many cases and urges caution. It notes that random objects (particularly random matrices) have "predictable" structures in the spectral domain, and it then offers a random matrix-based spectral filtering technique to retrieve the original data from a data set distorted by adding random values. It extends our earlier work, which questioned the efficacy of additive-noise random perturbation techniques for privacy-preserving data mining in the continuous-valued domain, and presents new results in the discrete domain. It shows that the growing collection of random perturbation-based "privacy-preserving" data mining techniques may need careful scrutiny in order to prevent privacy breaches through linear transformations. The paper also presents extensive experimental results to support this claim.
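
The idea of spectral filtering against additive noise can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify the filtering rule. The sketch assumes i.i.d. Gaussian noise with a variance known to the attacker, synthetic low-rank data, and a Marchenko-Pastur upper bound as the threshold separating noise eigenvalues from signal eigenvalues. All names (`spectral_filter`, `latent`, `mixing`, and so on) and all parameter values are hypothetical, chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "private" data (an assumption for this sketch): m records and
# n attributes that are linear mixtures of only a few latent factors.
m, n, rank = 2000, 30, 3
noise_std = 1.0

latent = rng.normal(scale=5.0, size=(m, rank))
mixing = rng.normal(size=(rank, n))
original = latent @ mixing

# Additive random perturbation: the data owner releases original + noise.
noise = rng.normal(scale=noise_std, size=(m, n))
perturbed = original + noise


def spectral_filter(data, noise_var):
    """Estimate the unperturbed data by projecting onto signal eigenvectors.

    Eigenvalues of the sample covariance that exceed the Marchenko-Pastur
    upper edge for pure noise are treated as signal (assumed threshold).
    """
    num_records, num_attrs = data.shape
    mean = data.mean(axis=0)
    centered = data - mean
    cov = centered.T @ centered / num_records

    # eigh returns eigenvalues in ascending order for a symmetric matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Upper edge of the noise eigenvalue spectrum for an n x n sample
    # covariance built from m records of i.i.d. noise.
    upper = noise_var * (1.0 + np.sqrt(num_attrs / num_records)) ** 2

    signal_vecs = eigvecs[:, eigvals > upper]
    return centered @ signal_vecs @ signal_vecs.T + mean


def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))


estimate = spectral_filter(perturbed, noise_std ** 2)

rmse_perturbed = rmse(perturbed, original)
rmse_filtered = rmse(estimate, original)
print("RMSE of released data vs. original:", rmse_perturbed)
print("RMSE of filtered estimate vs. original:", rmse_filtered)
```

In this setup the filtered estimate sits noticeably closer to the original than the released data does, because the noise is spread over all attributes while the signal is concentrated in a few eigen-directions. How much is recovered on real data depends on how strongly the attributes are correlated and on how well the attacker knows the noise distribution.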