Resolving the complexity of some data privacy problems

  • Authors:
  • Jeremiah Blocki;Ryan Williams

  • Affiliations:
  • Carnegie Mellon University;IBM Almaden Research Center

  • Venue:
  • ICALP'10 Proceedings of the 37th international colloquium conference on Automata, languages and programming: Part II
  • Year:
  • 2010

Abstract

We formally study two methods for data sanitation that have been used extensively in the database community: k-anonymity and l-diversity. We settle several open problems concerning the difficulty of applying these methods optimally, proving both positive and negative results:

  • 2-anonymity is in P.
  • The problem of partitioning the edges of a triangle-free graph into 4-stars (degree-three vertices) is NP-hard. This yields an alternative proof that 3-anonymity is NP-hard even when the database attributes are all binary.
  • 3-anonymity with only 27 attributes per record is MAX SNP-hard.
  • For databases with n rows, k-anonymity can be solved in O(4^n · poly(n)) time for all k > 1.
  • For databases with l attributes, alphabet size c, and n rows, k-anonymity can be solved in 2^{O(k^2 (2c)^l)} + O(nl) time.
  • 3-diversity with binary attributes is NP-hard, with one sensitive attribute.
  • 2-diversity with binary attributes is NP-hard, with three sensitive attributes.
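
To make the optimization problem concrete, here is a minimal sketch of optimal k-anonymity. It assumes the common suppression model, which the abstract does not spell out: entries are replaced by a star until every row is identical to at least k-1 other rows, and the cost is the number of starred entries. The function names `group_cost` and `min_suppressions` are hypothetical, and the subset dynamic program is a generic exhaustive search for illustration, not necessarily the authors' algorithm. It enumerates submasks of the set of rows, so it is exponential in n and only practical for tiny inputs.

```python
from functools import lru_cache


def group_cost(rows, group):
    """Stars needed to make the rows in `group` identical.

    Assumed cost model: every column on which the group disagrees is
    suppressed in each row of the group.
    """
    disagreeing = sum(
        1
        for j in range(len(rows[0]))
        if len({rows[i][j] for i in group}) > 1
    )
    return disagreeing * len(group)


def min_suppressions(rows, k):
    """Minimum number of suppressed entries that makes `rows` k-anonymous.

    Returns None when no partition into groups of size >= k exists
    (that is, when there are fewer than k rows).
    """
    n = len(rows)
    if n == 0:
        return 0
    inf = float("inf")

    @lru_cache(maxsize=None)
    def best(mask):
        # mask encodes the set of rows that still need a group.
        if mask == 0:
            return 0
        low = mask & -mask          # lowest remaining row must join some group
        rest = mask ^ low
        result = inf
        sub = rest
        while True:                 # enumerate every subset of the other rows
            group_mask = sub | low
            if bin(group_mask).count("1") >= k:
                group = [i for i in range(n) if (group_mask >> i) & 1]
                cost = group_cost(rows, group) + best(mask ^ group_mask)
                result = min(result, cost)
            if sub == 0:
                break
            sub = (sub - 1) & rest
        return result

    answer = best((1 << n) - 1)
    return None if answer == inf else answer
```

For example, with the binary rows `["000", "001", "110", "111"]` and k = 2, the best grouping pairs the first two rows and the last two rows, suppressing the final column in all four rows.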