L-diversity: Privacy beyond k-anonymity

  • Authors:
  • Ashwin Machanavajjhala;Daniel Kifer;Johannes Gehrke;Muthuramakrishnan Venkitasubramaniam

  • Affiliations:
  • Cornell University, Ithaca, NY;Cornell University, Ithaca, NY;Cornell University, Ithaca, NY;Cornell University, Ithaca, NY

  • Venue:
  • ACM Transactions on Knowledge Discovery from Data (TKDD)
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Publishing data about individuals without revealing sensitive information about them is an important problem. In recent years, a new definition of privacy called k-anonymity has gained popularity. In a k-anonymized dataset, each record is indistinguishable from at least k − 1 other records with respect to certain identifying attributes. In this article, we show using two simple attacks that a k-anonymized dataset has some subtle but severe privacy problems. First, an attacker can discover the values of sensitive attributes when there is little diversity in those sensitive attributes. This is a known problem. Second, attackers often have background knowledge, and we show that k-anonymity does not guarantee privacy against attackers using background knowledge. We give a detailed analysis of these two attacks, and we propose a novel and powerful privacy criterion called ℓ-diversity that can defend against such attacks. In addition to building a formal foundation for ℓ-diversity, we show in an experimental evaluation that ℓ-diversity is practical and can be implemented efficiently.