Towards optimal k-anonymization

Authors:
Tiancheng Li;Ninghui Li
Affiliations:
CERIAS and Department of Computer Science, Purdue University, 305 N. University Street, West Lafayette, IN 47907-2107, USA;CERIAS and Department of Computer Science, Purdue University, 305 N. University Street, West Lafayette, IN 47907-2107, USA
Venue:
Data & Knowledge Engineering
Year:
2008

Citing 16
Cited 8

Protecting Respondents' Identities in Microdata Release

IEEE Transactions on Knowledge and Data Engineering
k-anonymity: a model for protecting privacy

International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
Achieving k-anonymity privacy protection using generalization and suppression

International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
Transforming data to satisfy privacy constraints

Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Bottom-Up Generalization: A Data Mining Solution to Privacy Protection

ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
Top-Down Specialization for Information and Privacy Preservation

ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Data Privacy through Optimal k-Anonymization

ICDE '05 Proceedings of the 21st International Conference on Data Engineering
On the complexity of optimal K-anonymity

PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Incognito: efficient full-domain K-anonymity

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
On k-anonymity and the curse of dimensionality

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Ordinal, Continuous and Heterogeneous k-Anonymity Through Microaggregation

Data Mining and Knowledge Discovery
Mondrian Multidimensional K-Anonymity

ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
ℓ-Diversity: Privacy Beyond k-Anonymity

ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Achieving anonymity via clustering

Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
OPUS: an efficient admissible algorithm for unordered search

Journal of Artificial Intelligence Research
Anonymizing tables

ICDT'05 Proceedings of the 10th international conference on Database Theory

Privacy-preserving incremental data dissemination

Journal of Computer Security - Selected papers from the Third and Fourth Secure Data Management (SDM) workshops
The Role of Ontologies in the Anonymization of Textual Variables

Proceedings of the 2010 conference on Artificial Intelligence Research and Development: Proceedings of the 13th International Conference of the Catalan Association for Artificial Intelligence
Ontology-based anonymization of categorical values

MDAI'10 Proceedings of the 7th international conference on Modeling decisions for artificial intelligence
On the declassification of confidential documents

MDAI'11 Proceedings of the 8th international conference on Modeling decisions for artificial intelligence
Privacy protection of textual attributes through a semantic-based masking method

Information Fusion
Efficient discovery of de-identification policy options through a risk-utility frontier

Proceedings of the third ACM conference on Data and application security and privacy
Fast clustering-based anonymization approaches with time constraints for data streams

Knowledge-Based Systems
MAGE: A semantics retaining K-anonymization method for mixed data

Knowledge-Based Systems

Abstract

When releasing microdata for research purposes, one needs to preserve the privacy of respondents while maximizing data utility. An approach that has been studied extensively in recent years is to use anonymization techniques such as generalization and suppression to ensure that the released data table satisfies the k-anonymity property. A major thread of research in this area aims at developing more flexible generalization schemes and more efficient search algorithms to find better anonymizations (i.e., those with less information loss). This paper presents three new generalization schemes that are more flexible than existing schemes; this flexibility can lead to better anonymizations. We present a taxonomy of generalization schemes and discuss the relationships among them. We present enumeration algorithms and pruning techniques for finding optimal generalizations in the new schemes. Through experiments on real census data, we show that more flexible generalization schemes produce higher-quality anonymizations, and that the bottom-up approach works better than the top-down approach for small values of k and small numbers of quasi-identifier attributes.
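As a rough illustration of the k-anonymity property the abstract refers to, the following Python sketch checks whether a table is k-anonymous before and after a simple generalization step. It is not one of the paper's three schemes or its search algorithms. The function names (`is_k_anonymous`, `generalize_age`, `generalize_zip`), the attribute names, and the sample records are all assumptions made up for this example.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    that occurs in `rows` occurs in at least k rows."""
    groups = Counter(tuple(row[a] for a in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


def generalize_age(row, width=10):
    """Replace an exact age with a fixed-width range (hypothetical scheme)."""
    low = (row["age"] // width) * width
    return {**row, "age": f"{low}-{low + width - 1}"}


def generalize_zip(row, keep=3):
    """Keep the first `keep` digits of the ZIP code and mask the rest."""
    z = row["zip"]
    return {**row, "zip": z[:keep] + "*" * (len(z) - keep)}


# Made-up records; "age" and "zip" are treated as the quasi-identifiers.
records = [
    {"age": 23, "zip": "47906", "disease": "flu"},
    {"age": 27, "zip": "47907", "disease": "cold"},
    {"age": 35, "zip": "47901", "disease": "flu"},
    {"age": 38, "zip": "47905", "disease": "asthma"},
]
qi = ["age", "zip"]

generalized = [generalize_zip(generalize_age(r)) for r in records]

raw_ok = is_k_anonymous(records, qi, 2)      # every raw row is unique
gen_ok = is_k_anonymous(generalized, qi, 2)  # rows now fall into groups of two
```

In this toy table the raw records fail the check for k = 2 because each quasi-identifier combination is unique, while the generalized records pass because they collapse into two groups of two rows each. Coarser generalizations satisfy larger k at the cost of more information loss, which is the trade-off that more flexible schemes aim to improve.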