Analysis of privacy preserving random perturbation techniques: further explorations

  • Authors:
  • Haimonti Dutta; Hillol Kargupta; Souptik Datta; Krishnamoorthy Sivakumar

  • Affiliations:
  • University of Maryland, Baltimore County, Baltimore, Maryland; University of Maryland, Baltimore County, Baltimore, Maryland; University of Maryland, Baltimore County, Baltimore, Maryland; Washington State University, Pullman, Washington

  • Venue:
  • Proceedings of the 2003 ACM workshop on Privacy in the electronic society
  • Year:
  • 2003

Abstract

Privacy is becoming an increasingly important issue in many data mining applications, particularly in the security and defense area. This has triggered the development of many privacy-preserving data mining techniques. A large fraction of them use randomized data distortion to mask the data and thereby preserve privacy: they attempt to hide sensitive data by randomly modifying the data values with additive noise. This paper questions the utility of such randomized data distortion techniques for preserving privacy in many cases and urges caution. It notes that random objects (particularly random matrices) have "predictable" structures in the spectral domain, and it offers a random matrix-based spectral filtering technique that retrieves the original data from a data set distorted by adding random values. It extends our earlier work, which questioned the efficacy of additive-noise random perturbation techniques for privacy-preserving data mining in the continuous-valued domain, and presents new results in the discrete domain. It shows that the growing collection of random perturbation-based "privacy-preserving" data mining techniques may need careful scrutiny in order to prevent privacy breaches through linear transformations. The paper also presents extensive experimental results to support this claim.
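
The idea of filtering additive noise in the spectral domain can be illustrated with a minimal sketch. It is not the paper's algorithm as published; it assumes that the noise is i.i.d. Gaussian with a known variance, that the original data are strongly correlated (here, synthetic low-rank data), and that the Marchenko-Pastur upper bound is a reasonable threshold for separating noise eigenvalues from signal eigenvalues. The names `spectral_filter` and `demo`, and all parameter values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def spectral_filter(perturbed, noise_var):
    """Estimate the original data from perturbed = original + noise.

    Assumes i.i.d. zero-mean noise with known variance `noise_var`.
    """
    m, n = perturbed.shape  # m records, n attributes
    mean = perturbed.mean(axis=0)
    centered = perturbed - mean

    # Sample covariance of the perturbed data and its eigen-decomposition.
    cov = centered.T @ centered / m
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Marchenko-Pastur upper edge: eigenvalues of a pure-noise covariance
    # matrix are expected to fall below this bound (assumed threshold).
    lam_max = noise_var * (1.0 + np.sqrt(n / m)) ** 2

    # Keep only eigenvectors whose eigenvalues exceed the noise bound.
    signal_basis = eigvecs[:, eigvals > lam_max]

    # Project the perturbed data onto the estimated signal subspace.
    return centered @ signal_basis @ signal_basis.T + mean


def demo(seed=0, m=2000, n=20, rank=3, noise_std=1.0):
    """Return (RMSE of perturbed data, RMSE of filtered estimate)."""
    rng = np.random.default_rng(seed)

    # Synthetic correlated "private" data: a low-rank linear mixture.
    latent = rng.normal(size=(m, rank))
    mixing = 3.0 * rng.normal(size=(rank, n))
    original = latent @ mixing

    # Additive random perturbation, as in the masking schemes discussed.
    perturbed = original + rng.normal(scale=noise_std, size=(m, n))

    estimate = spectral_filter(perturbed, noise_std ** 2)
    rmse_perturbed = np.sqrt(np.mean((perturbed - original) ** 2))
    rmse_estimate = np.sqrt(np.mean((estimate - original) ** 2))
    return rmse_perturbed, rmse_estimate


if __name__ == "__main__":
    before, after = demo()
    print(f"RMSE of perturbed data:    {before:.3f}")
    print(f"RMSE of filtered estimate: {after:.3f}")
```

Under these assumptions the filtered estimate should lie noticeably closer to the original values than the perturbed data do, which is the kind of privacy breach the abstract warns about. The gain depends on how correlated the attributes are; for data with little correlation structure, the projection removes much less noise.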