Today, the publication of microdata poses a privacy threat: anonymous personal records can be re-identified using external data sources. Past research has tried to develop a notion of the privacy guarantee that an anonymized data set should satisfy before publication, culminating in the notion of t-closeness. To satisfy t-closeness, the records in a data set need to be grouped into Equivalence Classes (ECs), such that each EC contains records with indistinguishable quasi-identifier values, and its local distribution of sensitive attribute (SA) values stays close to the global distribution of SA values in the table. However, despite this progress, previous research has not offered an anonymization algorithm tailored for t-closeness. In this paper, we fill this gap with SABRE, an SA Bucketization and REdistribution framework for t-closeness. SABRE first greedily partitions a table into buckets of similar SA values and then redistributes the tuples of each bucket into dynamically determined ECs. This approach is facilitated by a property of the Earth Mover's Distance (EMD), which we employ as a measure of distribution closeness: if the tuples in an EC are picked proportionally to the sizes of the buckets they come from, then the EMD of that EC is tightly upper-bounded using localized upper bounds derived for each bucket. We prove that if the t-closeness constraint is properly obeyed during partitioning, then it is obeyed by the derived ECs too. We develop two instantiations of SABRE and extend it to a streaming environment. Our extensive experimental evaluation demonstrates that SABRE achieves information quality superior to that of schemes that merely apply algorithms tailored for other models to t-closeness, and can be much faster as well.
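To make the t-closeness condition concrete, the following minimal sketch checks whether a set of ECs satisfies it for a single numerical sensitive attribute. It is not the SABRE algorithm; it only illustrates the constraint that SABRE's output must meet. It assumes the ordered-distance form of the EMD, in which the ground distance between the i-th and j-th values of an m-value domain is |i - j| / (m - 1). The function names (`distribution`, `emd_ordered`, `satisfies_t_closeness`) and the salary data are hypothetical and chosen for illustration.

```python
from collections import Counter
from fractions import Fraction


def distribution(values, domain):
    """Relative frequency of each domain value among `values` (exact fractions)."""
    counts = Counter(values)
    n = len(values)
    return [Fraction(counts[v], n) for v in domain]


def emd_ordered(p, q):
    """EMD between two distributions over the same ordered domain.

    Assumes ordered ground distance |i - j| / (m - 1), for which the EMD
    equals the sum of absolute cumulative differences divided by (m - 1).
    """
    m = len(p)
    if m < 2:
        return Fraction(0)
    running = Fraction(0)  # cumulative sum of (p_i - q_i)
    total = Fraction(0)
    for p_i, q_i in zip(p, q):
        running += p_i - q_i
        total += abs(running)
    return total / (m - 1)


def satisfies_t_closeness(ecs, domain, t):
    """True if every EC's SA distribution is within EMD t of the table's."""
    table = [v for ec in ecs for v in ec]
    overall = distribution(table, domain)
    return all(emd_ordered(distribution(ec, domain), overall) <= t for ec in ecs)


# Hypothetical SA values: salaries in thousands, one list per EC.
domain = list(range(3, 12))
ecs = [[3, 4, 5], [6, 8, 11], [7, 9, 10]]
overall = distribution([v for ec in ecs for v in ec], domain)

skewed = emd_ordered(distribution(ecs[0], domain), overall)  # 3/8
spread = emd_ordered(distribution(ecs[1], domain), overall)  # 1/6
```

The EC holding only the three lowest salaries lies at distance 3/8 from the table distribution, whereas the EC whose salaries are spread across the domain lies at 1/6. With a threshold of t = 0.3 the first EC would violate t-closeness, which is why an anonymization scheme needs to mix SA values from different parts of the domain within each EC.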