Micro-aggregation-based heuristics for p-sensitive k-anonymity: one step beyond

  • Authors:
  • Agusti Solanas;Francesc Sebé;Josep Domingo-Ferrer

  • Affiliations:
  • Rovira i Virgili University, Tarragona, Catalonia, Spain;Rovira i Virgili University, Tarragona, Catalonia, Spain;Rovira i Virgili University, Tarragona, Catalonia, Spain

  • Venue:
  • PAIS '08 Proceedings of the 2008 international workshop on Privacy and anonymity in information society
  • Year:
  • 2008

Abstract

Micro-data protection is a hot topic in the field of Statistical Disclosure Control (SDC), and it has gained special interest since the disclosure of 658,000 queries by the AOL search engine in August 2006. Many algorithms, methods and properties have been proposed to deal with micro-data disclosure. Among them, p-sensitive k-anonymity has recently been defined as a refinement of k-anonymity. This new property requires that there be at least p different values for each confidential attribute within the records sharing a combination of key attributes. As with k-anonymity, the algorithm originally proposed to achieve this property was based on generalisations and suppressions; when data sets are numerical, this approach has several data utility problems, namely turning numerical key attributes into categorical ones, injecting new categories, injecting missing data, and so on. In this article, we recall the foundational concepts of micro-aggregation, k-anonymity and p-sensitive k-anonymity. We show that k-anonymity and p-sensitive k-anonymity can be achieved in numerical data sets by means of micro-aggregation heuristics properly adapted to this task. In addition, we present and evaluate two heuristics for p-sensitive k-anonymity which, being based on micro-aggregation, overcome most of the drawbacks resulting from the generalisation and suppression method.
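
To make the property stated in the abstract concrete, the following Python sketch checks whether a small table is p-sensitive k-anonymous, and applies a deliberately naive micro-aggregation step to one numerical key attribute. It is an illustration only and not one of the two heuristics evaluated in the paper. The function names (`is_p_sensitive_k_anonymous`, `naive_microaggregate`), the attribute names (`age`, `disease`) and the sample records are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def is_p_sensitive_k_anonymous(records, key_attrs, conf_attrs, k, p):
    """Check the property as worded in the abstract: every group of records
    sharing a combination of key-attribute values must contain at least k
    records and at least p distinct values of each confidential attribute."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[a] for a in key_attrs)].append(rec)
    for group in groups.values():
        if len(group) < k:
            return False
        for attr in conf_attrs:
            if len({rec[attr] for rec in group}) < p:
                return False
    return True


def naive_microaggregate(records, key_attr, k):
    """Illustrative univariate micro-aggregation (an assumption, not the
    authors' heuristic): sort by one numerical key attribute, cut the sorted
    list into consecutive groups of at least k records, and replace each key
    value by its group mean. Confidential attributes are left untouched."""
    if len(records) < k:
        raise ValueError("need at least k records")
    ordered = sorted(records, key=lambda r: r[key_attr])
    n_groups = len(ordered) // k
    masked = []
    for g in range(n_groups):
        start = g * k
        # The last group absorbs any leftover records.
        end = (g + 1) * k if g < n_groups - 1 else len(ordered)
        group = ordered[start:end]
        centroid = mean(r[key_attr] for r in group)
        masked.extend({**r, key_attr: centroid} for r in group)
    return masked


# Hypothetical sample data: "age" is the key attribute, "disease" is confidential.
records = [
    {"age": 30, "disease": "flu"},
    {"age": 31, "disease": "asthma"},
    {"age": 32, "disease": "flu"},
    {"age": 50, "disease": "diabetes"},
    {"age": 52, "disease": "diabetes"},
    {"age": 54, "disease": "flu"},
]
masked = naive_microaggregate(records, "age", k=3)
```

In this toy data set every original age is unique, so the unmasked table is not even 3-anonymous. After aggregation the two groups each hold three records and two distinct diseases, so the masked table satisfies the check for k = 3 and p = 2 but not for p = 3. The numerical key attribute stays numerical throughout, which is the data utility argument the abstract makes against generalisation and suppression. Note that this naive grouping ignores the confidential attribute when forming groups, so it offers no guarantee of p-sensitivity in general; it only happens to hold here.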