The hardness and approximation algorithms for l-diversity

Authors:
Xiaokui Xiao;Ke Yi;Yufei Tao
Affiliations:
Nanyang Technological University, Singapore;Hong Kong University of Science and Technology, Hong Kong;Chinese University of Hong Kong, Hong Kong
Venue:
Proceedings of the 13th International Conference on Extending Database Technology
Year:
2010

Abstract

Existing solutions to privacy-preserving publication fall into two categories: theoretical and heuristic. Theoretical solutions guarantee provably low information loss, whereas heuristic ones may incur very large loss in the worst case but have been shown empirically to perform well on many real inputs. While numerous heuristic algorithms have been developed to satisfy advanced privacy principles such as l-diversity and t-closeness, the theoretical category is currently limited to k-anonymity, the earliest principle, which is known to be severely vulnerable to privacy attacks. Motivated by this, we present the first theoretical study of l-diversity, a popular principle that is widely adopted in the literature. First, we show that optimal l-diverse generalization is NP-hard even when there are only 3 distinct sensitive values in the microdata. Then we develop an (l · d)-approximation algorithm, where d is the dimensionality of the underlying dataset. This is the first known algorithm with a non-trivial bound on information loss. Extensive experiments with real datasets validate the effectiveness and efficiency of the proposed solution.
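
To make the terms in the abstract concrete, the following is a minimal sketch of what checking l-diversity and measuring information loss could look like. It is not the paper's algorithm. It assumes one common formulation of l-diversity (in every group, no sensitive value accounts for more than a 1/l fraction of the tuples) and assumes that generalization is done by suppression, with information loss counted as the number of suppressed cells ("stars"). The function names `is_l_diverse` and `suppress` are hypothetical.

```python
from collections import Counter


def is_l_diverse(groups, l):
    """Check the assumed frequency-based l-diversity condition.

    `groups` is a list of groups, each given as the list of sensitive
    values of its tuples. A group passes if its most frequent sensitive
    value occurs in at most a 1/l fraction of the group's tuples.
    """
    for sensitive_values in groups:
        counts = Counter(sensitive_values)
        if not counts:
            return False  # assumed convention: an empty group is invalid
        # Integer comparison avoids floating-point division.
        if max(counts.values()) * l > len(sensitive_values):
            return False
    return True


def suppress(group_rows):
    """Generalize one group by suppression (an assumed loss model).

    Every quasi-identifier column on which the rows disagree is replaced
    by '*'. Returns the generalized rows and the number of stars used.
    """
    d = len(group_rows[0])  # d = number of quasi-identifier columns
    agree = [len({row[j] for row in group_rows}) == 1 for j in range(d)]
    out = [
        tuple(v if agree[j] else "*" for j, v in enumerate(row))
        for row in group_rows
    ]
    stars = sum(row.count("*") for row in out)
    return out, stars
```

Under these assumptions, an optimal l-diverse generalization would be a partition of the tuples into groups that passes `is_l_diverse` while minimizing the total star count, and an (l · d)-approximation would use at most l · d times as many stars as that optimum.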