k-Anonymization with Minimal Loss of Information

Authors: Aristides Gionis; Tamir Tassa
Affiliations: Yahoo Research, Barcelona; The Open University of Israel, Ra'anana
Venue: IEEE Transactions on Knowledge and Data Engineering
Year: 2009

Cited by (16):

Privacy-Preserving Data Publishing
Foundations and Trends in Databases

On the use of economic price theory to find the optimum levels of privacy and information utility in non-perturbative microdata anonymisation
Data & Knowledge Engineering

The k-anonymity problem is hard
FCT'09 Proceedings of the 17th international conference on Fundamentals of computation theory

Efficient Anonymizations with Enhanced Utility
Transactions on Data Privacy

Parameterized complexity of k-anonymity: hardness and tractability
IWOCA'10 Proceedings of the 21st international conference on Combinatorial algorithms

On the complexity of the l-diversity problem
MFCS'11 Proceedings of the 36th international conference on Mathematical foundations of computer science

Finding all maximally-matchable edges in a bipartite graph
Theoretical Computer Science

Limiting disclosure of sensitive data in sequential releases of databases
Information Sciences: an International Journal

Secure distributed computation of anonymized views of shared databases
ACM Transactions on Database Systems (TODS)

A practical approximation algorithm for optimal k-anonymity
Data Mining and Knowledge Discovery

k-Concealment: An Alternative Model of k-Type Anonymity
Transactions on Data Privacy

Towards an automatic construction of Contextual Attribute-Value Taxonomies
Proceedings of the 27th Annual ACM Symposium on Applied Computing

Parameterized complexity of k-anonymity: hardness and tractability
Journal of Combinatorial Optimization

Improving accuracy of classification models induced from anonymized datasets
Information Sciences: an International Journal

The effect of homogeneity on the computational complexity of combinatorial data anonymization
Data Mining and Knowledge Discovery

The l-Diversity problem: Tractability and approximability
Theoretical Computer Science

Abstract

The technique of k-anonymization allows databases that contain personal information to be released while ensuring some degree of individual privacy. Anonymization is usually performed by generalizing database entries. We formally study the concept of generalization and propose three information-theoretic measures for capturing the amount of information that is lost during the anonymization process. The proposed measures are more general and more accurate than those proposed by Meyerson and Williams [23] and Aggarwal et al. [1]. We study the problem of achieving k-anonymity with minimal loss of information. We prove that this problem is NP-hard and study polynomial approximations for the optimal solution. Our first algorithm gives an approximation guarantee of O(ln k) for two of our measures as well as for the previously studied measures. This improves on the best-known O(k)-approximation in [1]. Whereas the previous approximation algorithms relied on a graph representation framework, our algorithm relies on a novel hypergraph representation, which enables the improvement in the approximation ratio from O(k) to O(ln k). Because the running time of this algorithm is O(n^{2k}), we also show how to adapt the algorithm in [1] to obtain an O(k)-approximation algorithm that is polynomial in both n and k.
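
To make the notions of generalization, k-anonymity, and information-theoretic loss concrete, the following is a minimal Python sketch. It is an illustration under stated assumptions, not the algorithms or the exact measures of the paper: a generalized entry is modeled as a set of possible values, and the loss of an entry is taken to be the entropy of the attribute's empirical distribution restricted to that set. The function names (`is_k_anonymous`, `conditional_entropy`, `entropy_loss`) and the toy table are hypothetical.

```python
from collections import Counter
from math import log2


def is_k_anonymous(generalized, k):
    """Return True if every distinct generalized record occurs at least k times."""
    counts = Counter(generalized)
    return all(c >= k for c in counts.values())


def conditional_entropy(column, subset):
    """Entropy (in bits) of the column's empirical distribution restricted to subset.

    Assumption: this stands in for an information-theoretic cost of replacing
    a value by the set `subset`; it is an illustrative choice only.
    """
    counts = Counter(v for v in column if v in subset)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def entropy_loss(original, generalized):
    """Sum the per-entry entropy cost over all records and attributes."""
    columns = list(zip(*original))  # one tuple of values per attribute
    return sum(
        conditional_entropy(columns[j], subset)
        for record in generalized
        for j, subset in enumerate(record)
    )


# Hypothetical toy table: (age, gender).
original = [("30", "F"), ("30", "M"), ("40", "F"), ("40", "M")]

# Generalize by keeping the age and suppressing the gender,
# i.e. replacing it with the set of all gender values.
generalized = [
    (frozenset({age}), frozenset({"F", "M"})) for age, _ in original
]

two_anonymous = is_k_anonymous(generalized, 2)
three_anonymous = is_k_anonymous(generalized, 3)
loss = entropy_loss(original, generalized)
```

In this toy table each generalized record occurs twice, so the release is 2-anonymous but not 3-anonymous. The ages are kept exact and cost nothing, while each suppressed gender entry costs one bit under the assumed measure.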