On the Anonymization of Sparse High-Dimensional Data

Authors:
Gabriel Ghinita; Yufei Tao; Panos Kalnis
Affiliations:
Department of Computer Science, National University of Singapore, Computing 1, Singapore 117590. ghinitag@comp.nus.edu.sg; Department of Computer Science and Engineering, Chinese University of Hong Kong, Sha Tin, New Territories, Hong Kong SAR, China. taoyf@cse.cuhk.edu.hk; Department of Computer Science, National University of Singapore, Computing 1, Singapore 117590. kalnis@comp.nus.edu.sg
Venue:
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Year:
2008

Citing 0
Cited 36

Privacy-preserving anonymization of set-valued data

Proceedings of the VLDB Endowment
Anonymizing bipartite graph data using safe groupings

Proceedings of the VLDB Endowment
Privacy protection for RFID data

Proceedings of the 2009 ACM symposium on Applied Computing
Anonymizing healthcare data: a case study on the blood transfusion service

Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Anonymizing location-based RFID data

C3S2E '09 Proceedings of the 2nd Canadian Conference on Computer Science and Software Engineering
Privacy-Preserving Data Publishing

Foundations and Trends in Databases
Anonymization of set-valued data via top-down, local generalization

Proceedings of the VLDB Endowment
k-automorphism: a general framework for privacy preserving network publication

Proceedings of the VLDB Endowment
Anonymizing bipartite graph data using safe groupings

The VLDB Journal — The International Journal on Very Large Data Bases
Algorithm-safe privacy-preserving data publishing

Proceedings of the 13th International Conference on Extending Database Technology
Privacy-preserving data publishing: A survey of recent developments

ACM Computing Surveys (CSUR)
Towards publishing recommendation data with predictive anonymization

ASIACCS '10 Proceedings of the 5th ACM Symposium on Information, Computer and Communications Security
Centralized and Distributed Anonymization for High-Dimensional Healthcare Data

ACM Transactions on Knowledge Discovery from Data (TKDD)
Anonymizing transaction data to eliminate sensitive inferences

DEXA'10 Proceedings of the 21st international conference on Database and expert systems applications: Part I
Small domain randomization: same privacy, more utility

Proceedings of the VLDB Endowment
ρ-uncertainty: inference-proof transaction anonymization

Proceedings of the VLDB Endowment
Extended k-anonymity models against sensitive attribute disclosure

Computer Communications
Local and global recoding methods for anonymizing set-valued data

The VLDB Journal — The International Journal on Very Large Data Bases
PCTA: privacy-constrained clustering-based transaction data anonymization

Proceedings of the 4th International Workshop on Privacy and Anonymity in the Information Society
ASAP: Eliminating algorithm-based disclosure in privacy-preserving data publishing

Information Systems
C-safety: a framework for the anonymization of semantic trajectories

Transactions on Data Privacy
Publishing anonymous survey rating data

Data Mining and Knowledge Discovery
A publication process model to enable privacy-aware data sharing

IBM Journal of Research and Development
Anonymizing transaction data by integrating suppression and generalization

PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part I
Satisfying privacy requirements: one step before anonymization

PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part I
Utility-preserving transaction data anonymization with low information loss

Expert Systems with Applications: An International Journal
Utility-guided Clustering-based Transaction Data Anonymization

Transactions on Data Privacy
On the identity anonymization of high-dimensional rating data

Concurrency and Computation: Practice & Experience
Privacy preservation by disassociation

Proceedings of the VLDB Endowment
Anonymizing set-valued data by nonreciprocal recoding

Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
PrivBasis: frequent itemset mining with differential privacy

Proceedings of the VLDB Endowment
Clustering-oriented privacy-preserving data publishing

Knowledge-Based Systems
Protecting User Privacy Better with Query l-Diversity

International Journal of Information Security and Privacy
Privacy-preserving trajectory data publishing by local suppression

Information Sciences: an International Journal
A new tool for sharing and querying of clinical documents modeled using HL7 Version 3 standard

Computer Methods and Programs in Biomedicine
A general framework for privacy preserving data publishing

Knowledge-Based Systems

Abstract

Existing research on privacy-preserving data publishing focuses on relational data: in this context, the objective is to enforce privacy-preserving paradigms, such as k-anonymity and l-diversity, while minimizing the information loss incurred in the anonymization process (i.e., maximizing data utility). Existing techniques adopt an indexing- or clustering-based approach and work well for fixed-schema data with low dimensionality. However, certain applications require privacy-preserving publishing of transaction data (or basket data), which involves hundreds or even thousands of dimensions, rendering existing methods unusable. We propose a novel anonymization method for sparse high-dimensional data. We employ a particular representation that captures the correlation in the underlying data and facilitates the formation of anonymized groups with low information loss. We propose an efficient anonymization algorithm based on this representation. We show experimentally, using real-life datasets, that our method clearly outperforms the existing state of the art in terms of both data utility and computational overhead.