The boundary between privacy and utility in data publishing

Authors:
Vibhor Rastogi;Dan Suciu;Sungho Hong
Venue:
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Year:
2007


Abstract

We consider the privacy problem in data publishing: given a database instance containing sensitive information, "anonymize" it to obtain a view such that, on the one hand, attackers cannot learn any sensitive information from the view and, on the other hand, legitimate users can use it to compute useful statistics. These are conflicting goals. In this paper we prove an almost crisp separation between the case when a useful anonymization algorithm is possible and the case when it is not, based on the attacker's prior knowledge. Our definition of privacy is derived from the existing literature and relates the attacker's prior belief for a given tuple t to the posterior belief for the same tuple. Our definition of utility is based on the error bound on the estimates of counting queries. The main result has two parts. First, we show that if the prior beliefs for some tuples are large, then there exists no useful anonymization algorithm. Second, we show that when the prior is bounded for all tuples, then there exists an anonymization algorithm that is both private and useful. The anonymization algorithm that forms our positive result is novel, and improves the privacy/utility tradeoff of previously known algorithms with privacy/utility guarantees, such as FRAPP.
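To make the utility notion concrete, the following is a minimal sketch of how a counting query can be estimated from a randomized view. It is not the algorithm from the paper, which the abstract does not spell out. It assumes a generic scheme in which each real tuple is kept with one probability and each absent domain tuple is inserted with another. The names `perturb`, `estimate_count`, `keep_p`, and `insert_p` are hypothetical and chosen only for illustration.

```python
import random


def perturb(instance, domain, keep_p, insert_p, rng):
    """Build a noisy view (assumed scheme, not the paper's algorithm).

    Each tuple present in `instance` is kept with probability keep_p;
    each domain tuple absent from `instance` is inserted with
    probability insert_p.
    """
    view = set()
    for t in domain:
        p = keep_p if t in instance else insert_p
        if rng.random() < p:
            view.add(t)
    return view


def estimate_count(view, domain, predicate, keep_p, insert_p):
    """Unbiased estimate of how many instance tuples satisfy `predicate`.

    With n true matches among m candidate domain tuples,
    E[observed] = n * keep_p + (m - n) * insert_p; solving for n
    gives the estimator below (requires keep_p != insert_p).
    """
    observed = sum(1 for t in view if predicate(t))
    candidates = sum(1 for t in domain if predicate(t))
    return (observed - candidates * insert_p) / (keep_p - insert_p)


if __name__ == "__main__":
    rng = random.Random(0)
    domain = range(10_000)          # all possible tuples (illustrative)
    instance = set(range(2_000))    # the private database instance
    view = perturb(instance, domain, keep_p=0.6, insert_p=0.1, rng=rng)
    est = estimate_count(view, domain, lambda t: t < 5_000, 0.6, 0.1)
    print(f"true count = 2000, estimate = {est:.1f}")
```

In this sketch, moving `keep_p` and `insert_p` closer together makes a tuple's presence in the view less informative about the instance, but it also inflates the variance of the estimate. This is the tension between privacy and utility that the abstract describes.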