Efficient k-anonymization using clustering techniques

Authors:
Ji-Won Byun;Ashish Kamra;Elisa Bertino;Ninghui Li
Affiliations:
CERIAS and Computer Science, Purdue University;CERIAS and Electrical and Computer Engineering, Purdue University;CERIAS and Computer Science, Purdue University;CERIAS and Computer Science, Purdue University
Venue:
DASFAA'07 Proceedings of the 12th international conference on Database systems for advanced applications
Year:
2007


Abstract

k-anonymization techniques have been the focus of intense research in the last few years. An important requirement for such techniques is to ensure anonymization of data while at the same time minimizing the information loss resulting from data modifications. In this paper we propose an approach that uses the idea of clustering to minimize information loss and thus ensure good data quality. The key observation here is that data records that are naturally similar to each other should be part of the same equivalence class. We thus formulate a specific clustering problem, referred to as the k-member clustering problem. We prove that this problem is NP-hard and present a greedy heuristic, the complexity of which is in O(n^2). As part of our approach we develop a suitable metric to estimate the information loss introduced by generalizations, which works for both numeric and categorical data.
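
The abstract does not spell out the heuristic or the metric, so the following Python is only a minimal sketch of the general idea, not the authors' procedure. It assumes a simplified loss (cluster size times the normalised spread of each attribute, with categorical spread measured by distinct-value counts rather than a taxonomy), a deterministic seed choice, and hypothetical names such as `make_loss` and `greedy_k_member`. Because it recomputes the loss from scratch for every candidate, it makes no attempt to match the stated complexity bound.

```python
from typing import Callable, List, Sequence, Tuple


def make_loss(records: Sequence[Tuple], numeric_idx: Sequence[int],
              categorical_idx: Sequence[int]) -> Callable[[List[int]], float]:
    """Build an (assumed) information-loss function over clusters of row indices."""
    # Domain statistics are computed once over the whole table.
    widths = {j: max(r[j] for r in records) - min(r[j] for r in records)
              for j in numeric_idx}
    sizes = {j: len({r[j] for r in records}) for j in categorical_idx}

    def loss(cluster: List[int]) -> float:
        total = 0.0
        for j in numeric_idx:
            vals = [records[i][j] for i in cluster]
            if widths[j] > 0:
                # Generalising to the range [min, max] costs its relative width.
                total += (max(vals) - min(vals)) / widths[j]
        for j in categorical_idx:
            distinct = len({records[i][j] for i in cluster})
            if sizes[j] > 1:
                # Assumption: no taxonomy, so cost grows with distinct values.
                total += (distinct - 1) / (sizes[j] - 1)
        return len(cluster) * total

    return loss


def greedy_k_member(records: Sequence[Tuple], k: int,
                    numeric_idx: Sequence[int],
                    categorical_idx: Sequence[int]) -> List[List[int]]:
    """Greedily group row indices into clusters of at least k similar records."""
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    loss = make_loss(records, numeric_idx, categorical_idx)
    remaining = list(range(len(records)))
    clusters: List[List[int]] = []
    while len(remaining) >= k:
        # Assumption: seed with the first unassigned record (deterministic).
        cluster = [remaining.pop(0)]
        while len(cluster) < k:
            # Add the record that keeps the cluster's loss smallest.
            best = min(remaining, key=lambda i: loss(cluster + [i]))
            remaining.remove(best)
            cluster.append(best)
        clusters.append(cluster)
    for i in remaining:
        # Fewer than k leftovers: each joins the cluster it degrades least.
        target = min(clusters, key=lambda c: loss(c + [i]) - loss(c))
        target.append(i)
    return clusters


# Toy table with one numeric (age) and one categorical (sex) quasi-identifier.
records = [(25, "M"), (27, "M"), (26, "M"),
           (60, "F"), (62, "F"), (61, "F"), (40, "F")]
clusters = greedy_k_member(records, k=3, numeric_idx=[0], categorical_idx=[1])
```

Each resulting cluster would then be generalised to a single equivalence class, for example by replacing ages with the cluster's range. Every class holds at least k records, which is the k-anonymity requirement.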