Fast data anonymization with low information loss

Authors:
Gabriel Ghinita;Panagiotis Karras;Panos Kalnis;Nikos Mamoulis
Affiliations:
National University of Singapore;University of Hong Kong;National University of Singapore;University of Hong Kong
Venue:
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Year:
2007

Abstract

Recent research has studied the problem of publishing microdata without revealing sensitive information, leading to the privacy-preserving paradigms of k-anonymity and l-diversity. k-anonymity protects against the identification of an individual's record. l-diversity additionally safeguards against the association of an individual with specific sensitive information. However, existing approaches suffer from at least one of the following drawbacks: (i) The information loss metrics are counter-intuitive and fail to capture the data inaccuracies inflicted for the sake of privacy. (ii) l-diversity is solved by techniques developed for the simpler k-anonymity problem, which introduces unnecessary inaccuracies. (iii) The anonymization process is inefficient in terms of computation and I/O cost. In this paper, we propose a framework for efficient privacy preservation that addresses these deficiencies. First, we focus on one-dimensional (i.e., single-attribute) quasi-identifiers and study the properties of optimal solutions for k-anonymity and l-diversity, based on meaningful information loss metrics. Guided by these properties, we develop efficient heuristics that solve the one-dimensional problems in linear time. Finally, we generalize our solutions to multi-dimensional quasi-identifiers using space-mapping techniques. Extensive experimental evaluation shows that our techniques clearly outperform the state of the art in terms of execution time and information loss.
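
To make the one-dimensional setting concrete, the following is a minimal Python sketch, not the authors' heuristic. It assumes each record is a hypothetical `(quasi_identifier, sensitive_value)` pair, forms groups by a naive greedy cut of the sorted records into runs of at least k, checks the simplest "distinct values" variant of l-diversity, and uses an illustrative range-based loss. The abstract does not specify the paper's actual metrics or algorithms, so every function name and the loss formula here are assumptions.

```python
def partition_1d(records, k):
    """Sort by the quasi-identifier and cut into consecutive groups of size >= k.

    Naive greedy baseline for illustration only (assumption, not the paper's method).
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    ordered = sorted(records, key=lambda r: r[0])
    groups = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    # Merge a short trailing group into its predecessor so every group has >= k records.
    if len(groups) > 1 and len(groups[-1]) < k:
        groups[-2].extend(groups.pop())
    return groups


def is_k_anonymous(groups, k):
    """Every group must contain at least k records."""
    return all(len(g) >= k for g in groups)


def is_l_diverse(groups, l):
    """Distinct l-diversity: each group holds at least l different sensitive values."""
    return all(len({sensitive for _, sensitive in g}) >= l for g in groups)


def range_loss(groups, domain_width):
    """Record-weighted average of (group range / domain width); illustrative metric."""
    total = sum(len(g) for g in groups)
    return sum(len(g) * (g[-1][0] - g[0][0]) / domain_width for g in groups) / total


# Hypothetical toy data: (age, diagnosis)
records = [(21, "flu"), (23, "flu"), (25, "cold"), (40, "asthma"),
           (42, "flu"), (47, "cold"), (60, "flu")]
groups = partition_1d(records, k=3)
print(is_k_anonymous(groups, 3), is_l_diverse(groups, 2), is_l_diverse(groups, 3))
print(round(range_loss(groups, domain_width=60 - 21), 3))
```

On this toy input the greedy cut yields two groups that are 3-anonymous and 2-diverse but not 3-diverse, which illustrates why a grouping produced for k-anonymity alone need not satisfy a given l-diversity requirement.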