Privacy via pseudorandom sketches

Authors:
Nina Mishra; Mark Sandler
Affiliations:
University of Virginia; Cornell University
Venue:
Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Year:
2006

Abstract

Imagine a collection of individuals who each possess private data that they do not wish to share with a third party. This paper considers how individuals may represent and publish their own data so as to simultaneously preserve their privacy and ensure that it is possible to extract large-scale statistical behavior from the original unperturbed data. Existing techniques for perturbing data are limited by the number of users required to obtain approximate answers to queries, the richness of preserved statistical behavior, the privacy guarantees given and/or the amount of data that each individual must publish.

This paper introduces a new technique, based on pseudorandom sketches, for describing parts of an individual's data. The sketches guarantee that each individual's privacy is provably maintained under one of the strongest definitions of privacy that we are aware of: given unlimited computational power and arbitrary partial knowledge, the attacker cannot learn any additional private information from the published sketches. However, sketches from multiple users that describe a subset of attributes can be used to estimate the fraction of users that satisfy any conjunction over the full set of negated or unnegated attributes, provided that there are enough users. We show that the error of approximation is independent of the number of attributes involved and depends only on the number of users available. An additional benefit is that the size of the sketch is minuscule: ⌈log log O(M)⌉ bits, where M is the number of users. Finally, we show how sketches can be combined to answer more complex queries. An interesting property of our approach is that despite using cryptographic primitives, our privacy guarantees do not rely on any unproven cryptographic conjectures.
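
To give a feel for how small the stated sketch size is, the following is a rough arithmetic sketch only, not the paper's construction. It assumes base-2 logarithms and treats the O(M) term as exactly M, dropping the hidden constant; both choices are assumptions made for illustration, and the function name `sketch_bits` is hypothetical.

```python
import math


def sketch_bits(num_users: int) -> int:
    """Return ceil(log2(log2(M))) for M = num_users.

    Assumptions (not from the abstract): logs are base 2 and the
    O(M) term is taken to be exactly M, ignoring hidden constants.
    """
    if num_users < 2:
        raise ValueError("need at least 2 users for log log M to be defined")
    inner = math.log2(num_users)
    # Clamp to at least 1 bit so tiny populations still get a nonempty sketch.
    return max(1, math.ceil(math.log2(inner)))


if __name__ == "__main__":
    for m in (10**3, 10**6, 10**9):
        print(f"M = {m:>13,} users -> about {sketch_bits(m)} bits per sketch")
```

Under these assumptions, going from a million to a billion users does not change the result (5 bits in both cases), which is the sense in which the per-user sketch stays tiny as the population grows.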