POkA: identifying Pareto-optimal k-anonymous nodes in a domain hierarchy lattice

Authors:
Rinku Dewri;Indrajit Ray;Indrakshi Ray;Darrell Whitley
Affiliations:
Colorado State University, Fort Collins, CO, USA;Colorado State University, Fort Collins, CO, USA;Colorado State University, Fort Collins, CO, USA;Colorado State University, Fort Collins, CO, USA
Venue:
Proceedings of the 18th ACM conference on Information and knowledge management
Year:
2009

Abstract

Data generalization is widely used to protect identities and prevent inference of sensitive information during the public release of microdata. The k-anonymity model has been extensively applied in this context. The model seeks a generalization scheme under which every individual is indistinguishable from at least k-1 other individuals while the resulting information loss is kept to a minimum. The search is performed on a domain hierarchy lattice in which every node is a vector giving the level of generalization for each attribute. Understanding the trade-off between privacy and data utility requires knowing the minimum possible information loss for every possible value of k. However, obtaining this can easily lead to an exhaustive evaluation of all nodes in the hierarchy lattice. In this paper, we propose using the concept of Pareto-optimality to obtain the desired trade-off information. A Pareto-optimal generalization is one for which no other generalization provides a higher value of k without increasing the information loss. We introduce the Pareto-Optimal k-Anonymization (POkA) algorithm to traverse the hierarchy lattice and show that the number of node evaluations required to find the Pareto-optimal generalizations can be significantly reduced. Results on a benchmark data set show that the algorithm identifies all Pareto-optimal nodes by evaluating only 20% of the nodes in the lattice.
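
To make the notions of lattice node, k, and Pareto-optimality concrete, the following is a minimal Python sketch of the exhaustive baseline that the abstract describes as costly. It is not the POkA traversal itself, which the abstract does not specify. The toy table, the two generalization hierarchies (`gen_age`, `gen_zip`), and the loss measure (sum of normalized generalization levels) are all assumptions made for illustration; the paper's actual data set and loss metric are not given here.

```python
from itertools import product

# Hypothetical toy microdata: (age, zip code) quasi-identifiers.
RECORDS = [
    (23, "80521"), (27, "80523"), (34, "80521"),
    (38, "80525"), (45, "80631"), (47, "80634"),
]

# Assumed hierarchy heights: each attribute has levels 0, 1, 2.
MAX_LEVELS = (2, 2)


def gen_age(age, level):
    """Level 0: exact age, 1: decade, 2: fully suppressed."""
    if level == 0:
        return str(age)
    if level == 1:
        return f"{age // 10 * 10}s"
    return "*"


def gen_zip(zip_code, level):
    """Level 0: all 5 digits, 1: first 3 digits, 2: fully suppressed."""
    keep = {0: 5, 1: 3, 2: 0}[level]
    return zip_code[:keep] + "*" * (5 - keep)


def lattice_nodes():
    """Every node is a vector of generalization levels, one per attribute."""
    return list(product(*(range(m + 1) for m in MAX_LEVELS)))


def evaluate(node):
    """Return (k, loss) for a node: k is the smallest equivalence class size."""
    counts = {}
    for age, zip_code in RECORDS:
        key = (gen_age(age, node[0]), gen_zip(zip_code, node[1]))
        counts[key] = counts.get(key, 0) + 1
    k = min(counts.values())
    # Assumed loss measure: sum of normalized generalization levels.
    loss = sum(level / top for level, top in zip(node, MAX_LEVELS))
    return k, loss


def pareto_front():
    """Exhaustively evaluate all nodes and keep the non-dominated ones."""
    scores = {node: evaluate(node) for node in lattice_nodes()}
    front = []
    for node, (k, loss) in scores.items():
        # Dominated: some other node is no worse in both k and loss,
        # and strictly better in at least one of them.
        dominated = any(
            k2 >= k and loss2 <= loss and (k2 > k or loss2 < loss)
            for other, (k2, loss2) in scores.items()
            if other != node
        )
        if not dominated:
            front.append(node)
    return sorted(front)


if __name__ == "__main__":
    for node in pareto_front():
        print(node, evaluate(node))
```

On this toy table the exhaustive search evaluates all nine nodes and keeps (0, 0), (1, 1), and (2, 2), with k values of 1, 2, and 6 respectively. The point of an algorithm such as POkA, as the abstract states, is to find the same set of nodes while evaluating only a fraction of the lattice.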