Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator
ACM Transactions on Modeling and Computer Simulation (TOMACS) - Special issue on uniform random number generation
Privacy-preserving data mining
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Protecting Respondents' Identities in Microdata Release
IEEE Transactions on Knowledge and Data Engineering
Limiting privacy breaches in privacy preserving data mining
Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Privacy preserving mining of association rules
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Using randomized response techniques for privacy-preserving data mining
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
A Framework for High-Accuracy Privacy-Preserving Mining
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Deriving private information from randomized data
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Incognito: efficient full-domain K-anonymity
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
\ell -Diversity: Privacy Beyond \kappa -Anonymity
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Injecting utility into anonymized datasets
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Anonymizing sequential releases
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Anatomy: simple and effective privacy preservation
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
M-invariance: towards privacy preserving re-publication of dynamic datasets
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Privacy, accuracy, and consistency too: a holistic solution to contingency table release
Proceedings of the twenty-sixth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Maintaining data privacy in association rule mining
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
The boundary between privacy and utility in data publishing
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Minimality attack in privacy preserving data publishing
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Mechanism Design via Differential Privacy
FOCS '07 Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science
A learning theory approach to non-interactive database privacy
STOC '08 Proceedings of the fortieth annual ACM symposium on Theory of computing
Composition attacks and auxiliary information in data privacy
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
OptRR: Optimizing Randomized Response Schemes for Privacy-Preserving Data Mining
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
On Anti-Corruption Privacy Preserving Publication
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Privacy: Theory meets Practice on the Map
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Relationship privacy: output perturbation for queries with joins
Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Attacks on privacy and deFinetti's theorem
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
ICALP'06 Proceedings of the 33rd international conference on Automata, Languages and Programming - Volume Part II
Secure anonymization for incremental datasets
SDM'06 Proceedings of the Third VLDB international conference on Secure Data Management
Calibrating noise to sensitivity in private data analysis
TCC'06 Proceedings of the Third conference on Theory of Cryptography
Versatile publishing for privacy preservation
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Small domain randomization: same privacy, more utility
Proceedings of the VLDB Endowment
iReduct: differential privacy with reduced relative errors
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Hi-index | 0.03 |
Random perturbation is a popular method of computing anonymized data for privacy preserving data mining. It is simple to apply, ensures strong privacy protection, and permits effective mining of a large variety of data patterns. However, all the existing studies with good privacy guarantees focus on perturbation at a single privacy level. Namely, a fixed degree of privacy protection is imposed on all anonymized data released by the data holder. This drawback seriously limits the applicability of random perturbation in scenarios where the holder has numerous recipients to which different privacy levels apply. Motivated by this, we study the problem of multi-level perturbation, whose objective is to release multiple versions of a dataset anonymized at different privacy levels. The challenge is that various recipients may collude by sharing their data to infer privacy beyond their permitted levels. Our solution overcomes this obstacle, and achieves two crucial properties. First, collusion is useless, meaning that the colluding recipients cannot learn anything more than what the most trustable recipient (among the colluding recipients) already knows alone. Second, the data each recipient receives can be regarded (and hence, analyzed in the same way) as the output of conventional uniform perturbation. Besides its solid theoretical foundation, the proposed technique is both space economical and computationally efficient. It requires O (n+m) expected space, and produces a new anonymized version in O (n + log m) expected time, where n is the cardinality of the original dataset, and m the number of versions released previously. Both bounds are optimal under the realistic assumption that n » m.