A practice-oriented framework for measuring privacy and utility in data sanitization systems

  • Authors:
  • Michal Sramka; Reihaneh Safavi-Naini; Jörg Denzinger; Mina Askari

  • Affiliations:
  • Universitat Rovira i Virgili, Tarragona, Spain; University of Calgary, Calgary, Alberta, Canada; University of Calgary, Calgary, Alberta, Canada; University of Calgary, Calgary, Alberta, Canada

  • Venue:
  • Proceedings of the 2010 EDBT/ICDT Workshops
  • Year:
  • 2010

Abstract

Published data is prone to privacy attacks. Sanitization methods aim to prevent these attacks while maintaining the usefulness of the data for legitimate users. Quantifying the trade-off between usefulness and privacy of published data has been the subject of much research in recent years. We propose a pragmatic framework for evaluating sanitization systems in real-life settings and use data mining utility as a universal measure of usefulness and privacy. We propose a definition for data mining utility that can be tuned to capture the needs of data users and the adversaries' intentions in a setting that is specified by a database, a candidate sanitization method, and the privacy and utility concerns of the data owner. We use this framework to evaluate and compare the privacy and utility offered by two well-known sanitization methods, namely k-anonymity and ε-differential privacy, using UCI's "Adult" dataset and the Weka data mining package, with utility and privacy measures defined for users and adversaries. In the case of k-anonymity, we compare our results with the recent work of Brickell and Shmatikov (KDD 2008), and show that using data mining algorithms increases their proposed adversarial gains.
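For readers unfamiliar with the two sanitization methods named in the abstract, the following is a minimal, generic sketch of what each one guarantees. It is not the paper's framework or its data mining utility measure. The toy rows, column names, and the helper names `is_k_anonymous`, `laplace_noise`, and `dp_count` are all hypothetical and chosen only for illustration. k-anonymity is shown as a check that every quasi-identifier combination occurs at least k times, and ε-differential privacy is shown through the standard Laplace mechanism applied to a counting query with sensitivity 1.

```python
import random
from collections import Counter

# Hypothetical toy table (NOT the UCI "Adult" dataset): already generalized values.
ROWS = [
    {"age_range": "30-39", "education": "Bachelors", "income": ">50K"},
    {"age_range": "30-39", "education": "Bachelors", "income": "<=50K"},
    {"age_range": "40-49", "education": "HS-grad", "income": "<=50K"},
    {"age_range": "40-49", "education": "HS-grad", "income": ">50K"},
]
QUASI_IDENTIFIERS = ["age_range", "education"]  # assumed quasi-identifier columns


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential variates."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def dp_count(rows, predicate, epsilon, rng):
    """Laplace mechanism for a counting query (sensitivity 1): noise scale 1/epsilon."""
    true_count = sum(1 for row in rows if predicate(row))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    print(is_k_anonymous(ROWS, QUASI_IDENTIFIERS, k=2))
    print(dp_count(ROWS, lambda r: r["income"] == ">50K", epsilon=0.5, rng=rng))
```

A smaller ε means a larger noise scale and therefore stronger privacy but less accurate answers, which is one concrete form of the usefulness-versus-privacy trade-off the abstract refers to.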