Generically extending anonymization algorithms to deal with successive queries

  • Authors:
  • Manuel Barbosa; Alexandre Pinto; Bruno Gomes

  • Affiliations:
  • HASLab-INESC TEC & Universidade do Minho, Braga, Portugal; HASLab-INESC TEC & Instituto Superior da Maia, Maia, Portugal; HASLab-INESC TEC & Universidade do Minho, Braga, Portugal

  • Venue:
  • Proceedings of the 21st ACM international conference on Information and knowledge management
  • Year:
  • 2012

Abstract

This paper addresses multi-release anonymization of datasets. We consider dynamic datasets, in which records can be inserted and deleted, and treat each release as a small subset of the dataset, for example the result of a query. Compared to repeatedly releasing the full database, this has the obvious advantage of faster anonymization. We present an algorithm for post-processing anonymized query results that prevents attacks on anonymity that combine multiple released queries. The algorithm can be used with several distinct protection principles and anonymization algorithms, which makes it generic and flexible. We evaluate the algorithm experimentally and compare it to $m$-invariance in terms of both efficiency and data quality. To this end, we propose two data quality metrics based on Shannon's entropy, and show that they can be seen as a refinement of existing metrics.
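The abstract does not define the two entropy-based metrics, so the following is only a minimal sketch of the underlying idea, not the authors' metrics. It computes the Shannon entropy of a column's empirical distribution before and after generalization and takes the difference as a rough measure of lost information. The function names `shannon_entropy` and `entropy_loss` and the sample age values are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_loss(original, anonymized):
    """Hypothetical quality measure: entropy removed from a column by anonymization."""
    return shannon_entropy(original) - shannon_entropy(anonymized)


# Illustrative data only: four distinct ages generalized into two ranges.
ages = [23, 27, 31, 38]
generalized = ["20-29", "20-29", "30-39", "30-39"]

print(shannon_entropy(ages))            # 2.0 bits: four equally likely values
print(shannon_entropy(generalized))     # 1.0 bit: two equally likely ranges
print(entropy_loss(ages, generalized))  # 1.0 bit lost to generalization
```

A measure of this kind depends on the distribution of values rather than only on group sizes, which is one plausible reading of how an entropy-based metric could refine coarser ones.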