Clustering-Based Frequency l-Diversity Anonymization

  • Authors:
  • Mohammad-Reza Zare-Mirakabad;Aman Jantan;Stéphane Bressan

  • Affiliations:
  • School of Computer Sciences, Universiti Sains Malaysia, Malaysia and School of Computing, National University of Singapore, Singapore;School of Computer Sciences, Universiti Sains Malaysia, Malaysia;School of Computing, National University of Singapore, Singapore

  • Venue:
  • ISA '09 Proceedings of the 3rd International Conference and Workshops on Advances in Information Security and Assurance
  • Year:
  • 2009

Abstract

Privacy preservation is realized by transforming data into k-anonymous (k-anonymization) and l-diverse (l-diversification) versions while minimizing information loss. Frequency l-diversity is possibly the most practical instance of the generic l-diversity principle for privacy preservation. In this paper, we propose an algorithm for frequency l-diversification whose primary objective is to minimize information loss. Most studies in privacy preservation have focused on k-anonymization. While algorithms for simple l-diversity principles can be obtained by adapting k-anonymization algorithms, such adaptation is not straightforward for some other principles. Our algorithm, called Bucket Clustering, adapts k-member Clustering. To guarantee termination, however, it uses hashing and buckets as in the Anatomy algorithm. To keep information loss low, each cluster is built by choosing the tuples that minimize information loss. We empirically show that our algorithm achieves low information loss with acceptable efficiency.
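
The abstract does not spell out the frequency l-diversity condition. As an illustration only, the sketch below assumes the commonly used formulation (the one associated with Anatomy): in every group of tuples, no sensitive value may account for more than a 1/l fraction of the group. The function name `is_frequency_l_diverse` and the representation of a group as a list of sensitive values are hypothetical and are not taken from the paper; this checks the property on a finished grouping and is not the Bucket Clustering algorithm itself.

```python
from collections import Counter


def is_frequency_l_diverse(groups, l):
    """Check an assumed frequency l-diversity condition.

    groups: iterable of lists, each holding the sensitive values of the
            tuples in one group (hypothetical representation).
    l:      diversity parameter, a positive integer.

    Returns True if, in every group, the most frequent sensitive value
    occurs in at most 1/l of the group's tuples.
    """
    for sensitive_values in groups:
        if not sensitive_values:
            # An empty group is treated as invalid in this sketch.
            return False
        counts = Counter(sensitive_values)
        # Integer form of max_count / size <= 1 / l, avoiding float rounding.
        if max(counts.values()) * l > len(sensitive_values):
            return False
    return True


# Made-up example data: one group of four tuples.
example_groups = [["flu", "cancer", "flu", "hiv"]]
print(is_frequency_l_diverse(example_groups, 2))
print(is_frequency_l_diverse(example_groups, 3))
```

In the example group, "flu" covers 2 of 4 tuples, so the group meets the assumed condition for l = 2 but not for l = 3.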