Disclosure Risks of Distance Preserving Data Transformations

  • Authors:
  • E. Onur Turgay;Thomas B. Pedersen;Yücel Saygın;Erkay Savaş;Albert Levi

  • Affiliations:
  • Sabanci University, Istanbul, Turkey 34956;Sabanci University, Istanbul, Turkey 34956;Sabanci University, Istanbul, Turkey 34956;Sabanci University, Istanbul, Turkey 34956;Sabanci University, Istanbul, Turkey 34956

  • Venue:
  • SSDBM '08 Proceedings of the 20th international conference on Scientific and Statistical Database Management
  • Year:
  • 2008

Abstract

One of the fundamental challenges that the data mining community faces today is privacy. The question "How are we going to do data mining without violating the privacy of individuals?" is still open, and research continues into efficient methods for doing so. Data transformation was previously proposed as one efficient method for privacy-preserving data mining when a party needs to outsource the data mining task, or when distributed data mining must be performed among multiple parties without any party disclosing its actual data. In this paper we study the safety of distance-preserving data transformations proposed for privacy-preserving data mining. We show that an adversary can recover the original data values with very high confidence from knowledge of the mutual distances between data objects together with the probability distribution from which they are drawn. Experiments conducted on real and synthetic data sets demonstrate the effectiveness of the theoretical results.
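
To make the risk concrete, the following is a minimal sketch of why preserved distances can leak data. It is not the attack described in the paper, which relies on the probability distribution of the data. Instead it assumes a simpler, hypothetical adversary who already knows three original 2-D records and observes the released (rotated) data set. All names (`rotate`, `trilaterate`, `secret_theta`) and all data values are illustrative assumptions.

```python
import math

def rotate(p, theta):
    """A distance-preserving transformation: rotate a 2-D point by theta."""
    x, y = p
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)

def dist(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def trilaterate(anchors, d):
    """Recover a 2-D point from its distances d to three non-collinear anchors."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    # Subtracting the first circle equation from the other two
    # leaves a 2x2 linear system in the unknown coordinates.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d[0] ** 2 - d[1] ** 2 + x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2
    b2 = d[0] ** 2 - d[2] ** 2 + x3 ** 2 - x1 ** 2 + y3 ** 2 - y1 ** 2
    det = a11 * a22 - a12 * a21
    # Cramer's rule for the 2x2 system.
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)

# Hypothetical private data and a secret rotation angle (illustrative values).
original = [(1.0, 2.0), (4.0, 0.5), (0.0, 5.0), (3.0, 3.5)]
secret_theta = 1.234
released = [rotate(p, secret_theta) for p in original]

# Assumption: the adversary knows the first three original records.
known = original[:3]

# Distances measured in the released data equal the original distances.
d = [dist(released[3], released[i]) for i in range(3)]

# The fourth record is recovered without ever learning secret_theta.
recovered = trilaterate(known, d)
print(recovered)
```

The recovered point matches the hidden fourth record up to floating-point error, even though the adversary never learns the rotation angle. The setting studied in the paper is harder, since the adversary there works from distributional knowledge and not from known records, but the underlying weakness is the same: mutual distances survive the transformation.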