On the complexity of the l-diversity problem

Authors:
Riccardo Dondi;Giancarlo Mauri;Italo Zoppis
Affiliations:
Dipartimento di Scienze dei Linguaggi, Università degli Studi di Bergamo, Bergamo, Italy;DISCo, Università degli Studi di Milano-Bicocca, Milano, Italy;DISCo, Università degli Studi di Milano-Bicocca, Milano, Italy
Venue:
MFCS'11 Proceedings of the 36th international conference on Mathematical foundations of computer science
Year:
2011

Citing 17
Cited 1

Fixed-parameter tractability and completeness II: on completeness for W[1]

Theoretical Computer Science
A sub-constant error-probability low-degree test, and a sub-constant error-probability PCP characterization of NP

STOC '97 Proceedings of the twenty-ninth annual ACM symposium on Theory of computing
Some APX-completeness results for cubic graphs

Theoretical Computer Science
Approximation algorithms

Approximation algorithms
Protecting Respondents' Identities in Microdata Release

IEEE Transactions on Knowledge and Data Engineering
k-anonymity: a model for protecting privacy

International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
On the complexity of optimal K-anonymity

PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
l-Diversity: Privacy Beyond k-Anonymity

ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Algorithmic construction of sets for k-restrictions

ACM Transactions on Algorithms (TALG)
Approximate algorithms for K-anonymity

Proceedings of the 2007 ACM SIGMOD international conference on Management of data
k-Anonymization with Minimal Loss of Information

IEEE Transactions on Knowledge and Data Engineering
The hardness and approximation algorithms for l-diversity

Proceedings of the 13th International Conference on Extending Database Technology
Clustering with diversity

ICALP'10 Proceedings of the 37th international colloquium conference on Automata, languages and programming
Resolving the complexity of some data privacy problems

ICALP'10 Proceedings of the 37th international colloquium conference on Automata, languages and programming: Part II
Parameterized complexity of k-anonymity: hardness and tractability

IWOCA'10 Proceedings of the 21st international conference on Combinatorial algorithms
Anonymizing binary and small tables is hard to approximate

Journal of Combinatorial Optimization
Anonymizing tables

ICDT'05 Proceedings of the 10th international conference on Database Theory

The l-Diversity problem: Tractability and approximability

Theoretical Computer Science

Abstract

The problem of publishing personal data without giving up privacy is becoming increasingly important. Several formalizations have recently been proposed in this context, such as k-anonymity [17,18] and l-diversity [12]. These approaches require that the rows of a table be clustered into sets satisfying some constraint, in order to prevent the identification of the individuals the rows belong to. In this paper we focus on the l-diversity problem, where the attributes are divided into sensitive attributes and quasi-identifier attributes. The goal is to partition the set of rows so that, in each set C of the partition, the number of rows having any specific value in the sensitive attribute is at most |C|/l. We investigate the approximation and parameterized complexity of l-diversity. Concerning the approximation complexity, we prove the following results: (i) the problem is not approximable within factor c ln l, for some constant c > 0, even if the input table consists of two columns; (ii) the problem is APX-hard, even if l = 4 and the input table contains exactly 3 columns; (iii) the problem admits an approximation algorithm of factor m (where m + 1 is the number of columns in the input table), when the sensitive attribute ranges over an alphabet of constant size. Concerning the parameterized complexity, we prove the following results: (i) the problem is W[1]-hard even if parameterized by the size of the solution, l, and the size of the alphabet; (ii) the problem admits a fixed-parameter algorithm when both the maximum number of different values in a column and the number of columns are parameters.
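
The feasibility constraint in the abstract can be illustrated with a short check. This is a minimal sketch, not the authors' algorithm: it only verifies that a given partition is l-diverse, that is, that in every set C no sensitive value occurs more than |C|/l times. The function name `is_l_diverse`, the row layout (sensitive attribute in the last column), and the sample rows are all assumptions made for illustration.

```python
from collections import Counter


def is_l_diverse(clusters, l, sensitive_index=-1):
    """Return True if every cluster satisfies the l-diversity constraint.

    clusters: list of clusters, each a list of rows (tuples).
    l: the diversity parameter.
    sensitive_index: position of the sensitive attribute in a row
        (hypothetical convention: last column by default).
    """
    for cluster in clusters:
        # Count how often each sensitive value appears in this cluster.
        counts = Counter(row[sensitive_index] for row in cluster)
        # Require count <= |C| / l; compare count * l <= |C| to stay in integers.
        if any(count * l > len(cluster) for count in counts.values()):
            return False
    return True


# Hypothetical table: two quasi-identifier columns, then the sensitive column.
rows = [
    ("130**", "<30", "flu"),
    ("130**", "<30", "cancer"),
    ("148**", ">=40", "flu"),
    ("148**", ">=40", "flu"),
]

diverse_partition = [rows[:2]]      # two distinct sensitive values in one set
non_diverse_partition = [rows[2:]]  # "flu" fills the whole set
```

With l = 2, the first partition passes because each sensitive value occurs once in a set of size 2, while the second fails because "flu" occurs twice in a set of size 2. The check says nothing about the cost of a partition, which is what the optimization problem studied in the paper minimizes.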