Approximate algorithms with generalizing attribute values for k-anonymity

Authors:
Hyoungmin Park;Kyuseok Shim
Affiliations:
School of Electrical Engineering and Computer Science, Seoul National University, Kwanak, P.O. Box 34, Seoul, Republic of Korea;School of Electrical Engineering and Computer Science, Seoul National University, Kwanak, P.O. Box 34, Seoul, Republic of Korea
Venue:
Information Systems
Year:
2010

Abstract

When a table containing individual data is published, disclosure of sensitive information should be prevented. Simply removing identifiers such as name and social security number is not enough, because a linking attack, which joins the published table with other tables on some attributes, may still reveal the sensitive information. The notion of k-anonymity, which makes each record in the table indistinguishable from at least k-1 other records by suppression or generalization, was proposed to address this. k-anonymizing a table while minimizing information loss has been shown to be NP-hard. Approximation algorithms with approximation ratios of up to O(k) have been proposed for the case where generalization is used for anonymization. In this paper, we propose several approximation algorithms for k-anonymity that generalize attribute values using hierarchies, guarantee an O(log k) approximation ratio, and perform significantly better than the traditional algorithms. Since suppression of attributes is a special case of generalization with hierarchies that are two-level trees whose root nodes are the '*' character, our approximation result also applies to suppression methods. We next provide O(β log k)-approximation algorithms that gracefully adjust their running time according to the tolerance β (≥ 1) of the approximation ratio. We also present approximation algorithms for both k-anonymity and ℓ-diversity that generalize attribute values using hierarchies. Experimental results confirm that our approximation algorithms perform significantly better than traditional approximation algorithms.
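
The definitions used in the abstract can be illustrated with a minimal Python sketch. It is not one of the paper's approximation algorithms: it only checks whether a table is k-anonymous and applies generalization along a hierarchy, treating suppression as a two-level hierarchy rooted at '*'. The hierarchies, the sample table, and the names `generalize`, `anonymize`, and `is_k_anonymous` are all hypothetical and chosen for illustration.

```python
from collections import Counter

# Hypothetical generalization hierarchy for ZIP codes: child -> parent.
ZIP_PARENT = {
    "08826": "0882*",
    "08827": "0882*",
    "0882*": "*",
}

# Suppression modeled as a two-level hierarchy: every value maps to '*'.
AGE_PARENT = {"23": "*", "27": "*", "31": "*"}

HIERARCHIES = (ZIP_PARENT, AGE_PARENT)


def generalize(value, parent, levels):
    """Climb `levels` steps up the hierarchy, stopping early at the root."""
    for _ in range(levels):
        if value not in parent:
            break
        value = parent[value]
    return value


def anonymize(rows, levels):
    """Generalize each attribute of each row by its per-attribute level."""
    return [
        tuple(generalize(v, h, l) for v, h, l in zip(row, HIERARCHIES, levels))
        for row in rows
    ]


def is_k_anonymous(rows, k):
    """True if every distinct row occurs at least k times."""
    return all(count >= k for count in Counter(rows).values())


# Illustrative table of (zip, age) quasi-identifiers; not data from the paper.
table = [("08826", "23"), ("08827", "27"), ("08826", "31"), ("08827", "23")]

if __name__ == "__main__":
    print(is_k_anonymous(table, 2))                     # raw table
    print(is_k_anonymous(anonymize(table, (0, 1)), 2))  # suppress age only
    print(is_k_anonymous(anonymize(table, (1, 0)), 2))  # generalize zip only
```

In this toy table, suppressing the age column alone already yields two groups of two identical records, while generalizing only the ZIP code does not. Choosing which values to generalize so that information loss is minimized is the NP-hard optimization problem that the approximation algorithms address.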