Extending l-diversity to generalize sensitive data

  • Authors:
  • Hongwei Tian; Weining Zhang

  • Venue:
  • Data & Knowledge Engineering
  • Year:
  • 2011

Abstract

Generalization is an important technique for protecting privacy in data dissemination. In the framework of generalization, ℓ-diversity is a strong notion of privacy. However, existing ℓ-diversity measures are defined in terms of the most specific (rather than general) sensitive attribute (SA) values. As a result, algorithms based on these measures can have narrow eligible ranges for data with a heavily skewed distribution of SA values, and they produce anonymous data of low utility. In this paper, we propose a new ℓ-diversity measure called functional (τ, ℓ)-diversity, which extends ℓ-diversity by using a simple function to constrain the frequencies of base SA values that are induced by general SA values. Algorithms based on (τ, ℓ)-diversity may therefore generalize SA values and are much less constrained by skewed SA distributions. We show that (τ, ℓ)-diversity is more flexible and elaborate than existing ℓ-diversity measures. We present an efficient heuristic algorithm that uses a novel order of quasi-identifier (QI) values to achieve (τ, ℓ)-diversity. We compare our algorithm with two state-of-the-art algorithms that are based on existing ℓ-diversity measures. Our preliminary experimental results indicate that our algorithm not only provides stronger privacy protection but also yields better utility of the anonymous data.
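
To make the skew problem concrete, the following is a minimal sketch of the classic distinct ℓ-diversity check, in which every group of records sharing the same generalized QI values must contain at least ℓ distinct SA values. It is not the functional (τ, ℓ)-diversity measure proposed in the paper, whose definition the abstract does not give. The function names and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def is_distinct_l_diverse(groups, l):
    """Return True if every group holds at least l distinct SA values.

    `groups` is a list of lists; each inner list holds the base
    (most specific) SA values of one QI group. Hypothetical helper.
    """
    return all(len(set(sa_values)) >= l for sa_values in groups)


def max_frequency(sa_values):
    """Relative frequency of the most common SA value in a non-empty group."""
    counts = Counter(sa_values)
    return max(counts.values()) / len(sa_values)


# Hypothetical toy data: SA values per QI group.
skewed = [
    ["flu", "flu", "flu", "flu"],   # only one distinct value
    ["flu", "flu", "flu", "HIV"],   # dominated by "flu"
]
balanced = [
    ["flu", "HIV", "flu", "cancer"],
    ["flu", "HIV", "cancer", "flu"],
]

skewed_ok = is_distinct_l_diverse(skewed, 2)
balanced_ok = is_distinct_l_diverse(balanced, 2)
```

With the skewed data, the first group fails even for ℓ = 2, so an algorithm restricted to base SA values would have to merge or further generalize QI groups, which is the loss of utility the abstract describes.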