Thoughts on k-anonymization

  • Authors:
  • M. Ercan Nergiz; Chris Clifton

  • Affiliations:
  • Department of Computer Sciences, Purdue University, 250 N. University Street, West Lafayette, IN 47907-2066, USA (both authors)

  • Venue:
  • Data & Knowledge Engineering
  • Year:
  • 2007

Abstract

k-Anonymity is a method for providing privacy protection by ensuring that data cannot be traced to an individual. In a k-anonymous dataset, any identifying information occurs in at least k tuples. Many algorithms have recently been proposed to achieve optimal and practical k-anonymity, each with its own assumptions and restrictions and its own metrics for measuring quality. This paper evaluates a family of clustering-based algorithms that are more flexible and that even attempt to improve precision by ignoring the restrictions of user-defined Domain Generalization Hierarchies. Evaluating the new approaches with respect to cost metrics shows that a metric may behave differently with different algorithms and may not correlate with the accuracy of some applications on the output data.
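
The k-anonymity condition stated in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `is_k_anonymous`, the toy table, and the choice of `age` and `zip` as the identifying (quasi-identifier) attributes are all assumptions made for illustration. The check groups tuples by their identifying values and requires every group to contain at least k tuples.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    that occurs in `rows` occurs in at least `k` rows.

    Hypothetical helper for illustration; an empty table is
    treated as trivially k-anonymous.
    """
    counts = Counter(
        tuple(row[attr] for attr in quasi_identifiers) for row in rows
    )
    return all(count >= k for count in counts.values())


# Hypothetical, already-generalized toy table (not data from the paper).
rows = [
    {"age": "20-29", "zip": "479**", "diagnosis": "flu"},
    {"age": "20-29", "zip": "479**", "diagnosis": "asthma"},
    {"age": "30-39", "zip": "479**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "479**", "diagnosis": "diabetes"},
]

two_anonymous = is_k_anonymous(rows, ["age", "zip"], 2)
three_anonymous = is_k_anonymous(rows, ["age", "zip"], 3)
```

In this toy table each (age, zip) combination appears in exactly two tuples, so the table satisfies the condition for k = 2 but not for k = 3. The sketch only verifies the property; it does not perform the generalization or clustering that an anonymization algorithm would need to produce such a table.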