A framework for efficient data anonymization under privacy and accuracy constraints

Authors:
Gabriel Ghinita;Panagiotis Karras;Panos Kalnis;Nikos Mamoulis
Affiliations:
National University of Singapore, Singapore;National University of Singapore, Singapore;National University of Singapore, Singapore;University of Hong Kong, Hong Kong
Venue:
ACM Transactions on Database Systems (TODS)
Year:
2009

Abstract

Recent research has studied the problem of publishing microdata without revealing sensitive information, leading to the privacy-preserving paradigms of k-anonymity and l-diversity. k-anonymity protects against the identification of an individual's record. l-diversity, in addition, safeguards against the association of an individual with specific sensitive information. However, existing approaches suffer from at least one of the following drawbacks: (i) l-diversification is solved by techniques developed for the simpler k-anonymization problem, which causes unnecessary information loss; (ii) the anonymization process is inefficient in terms of computational and I/O cost; (iii) previous research focused exclusively on the privacy-constrained problem and ignored the equally important accuracy-constrained (or dual) anonymization problem. In this article, we propose a framework for efficient anonymization of microdata that addresses these deficiencies. First, we focus on one-dimensional (i.e., single-attribute) quasi-identifiers, and study the properties of optimal solutions under the k-anonymity and l-diversity models for the privacy-constrained (i.e., direct) and the accuracy-constrained (i.e., dual) anonymization problems. Guided by these properties, we develop efficient heuristics to solve the one-dimensional problems in linear time. Finally, we generalize our solutions to multidimensional quasi-identifiers using space-mapping techniques. Extensive experimental evaluation shows that our techniques clearly outperform the existing approaches in terms of execution time and information loss.
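
To make the one-dimensional setting concrete, the following is a minimal illustrative sketch and not the heuristic proposed in the article. It assumes records are (quasi-identifier, sensitive value) pairs, sorts them by the single quasi-identifier, cuts the sorted order into consecutive groups of at least k records, and generalizes each group to its value range. The function names (partition_k_anonymous, generalize, is_l_diverse) and the sample data are hypothetical, and l-diversity is checked here in its simplest form (at least l distinct sensitive values per group). Because the sketch sorts its input, it makes no claim about the linear-time bound stated in the abstract.

```python
from typing import List, Tuple

# Assumed record layout: (quasi-identifier value, sensitive value).
Record = Tuple[int, str]


def partition_k_anonymous(records: List[Record], k: int) -> List[List[Record]]:
    """Sort by the 1D quasi-identifier and cut into consecutive groups of size >= k."""
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    ordered = sorted(records, key=lambda r: r[0])
    groups = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    # A trailing group with fewer than k records is merged into its predecessor.
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    return groups


def generalize(groups: List[List[Record]]) -> List[Tuple[Tuple[int, int], List[str]]]:
    """Replace each group's quasi-identifier values by the range (min, max)."""
    return [((g[0][0], g[-1][0]), [s for _, s in g]) for g in groups]


def is_l_diverse(groups: List[List[Record]], l: int) -> bool:
    """Simplest form of l-diversity: each group has at least l distinct sensitive values."""
    return all(len({s for _, s in g}) >= l for g in groups)


# Hypothetical sample data: (age, diagnosis).
records = [(25, "flu"), (31, "cold"), (22, "flu"), (40, "asthma"),
           (37, "flu"), (29, "cold"), (52, "cold")]
groups = partition_k_anonymous(records, k=3)
published = generalize(groups)
```

With this sample, the seven records form two groups covering the age ranges 22 to 29 and 31 to 52. Both groups hold at least three records, so the result is 3-anonymous, but the first group contains only two distinct diagnoses, so it is 2-diverse and not 3-diverse. This illustrates why a grouping built only for k-anonymity may need further adjustment to satisfy l-diversity.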