Information preservation in statistical privacy and Bayesian estimation of unattributed histograms

Authors:
Bing-Rong Lin;Daniel Kifer
Affiliations:
Penn State University, University Park, PA, USA;Penn State University, University Park, PA, USA
Venue:
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Year:
2013

Citing 32
Cited 0

Security-control methods for statistical databases: a comparative study

ACM Computing Surveys (CSUR)
The uncertain reasoner's companion: a mathematical perspective

The uncertain reasoner's companion: a mathematical perspective
Revealing information while preserving privacy

Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Privacy preserving mining of association rules

Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Transforming data to satisfy privacy constraints

Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Convex Optimization

Convex Optimization
Practical privacy: the SuLQ framework

Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Pattern Recognition and Machine Learning (Information Science and Statistics)

Pattern Recognition and Machine Learning (Information Science and Statistics)
Anatomy: simple and effective privacy preservation

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Privacy, accuracy, and consistency too: a holistic solution to contingency table release

Proceedings of the twenty-sixth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
The boundary between privacy and utility in data publishing

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Minimality attack in privacy preserving data publishing

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Mechanism Design via Differential Privacy

FOCS '07 Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science
A learning theory approach to non-interactive database privacy

STOC '08 Proceedings of the fortieth annual ACM symposium on Theory of computing
Universally utility-maximizing privacy mechanisms

Proceedings of the forty-first annual ACM symposium on Theory of computing
On the complexity of differentially private data release: efficient algorithms and hardness results

Proceedings of the forty-first annual ACM symposium on Theory of computing
On Unifying Privacy and Uncertain Data Models

ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Using Anonymized Data for Classification

ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Privacy integrated queries: an extensible platform for privacy-preserving data analysis

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Privacy-Preserving Data Publishing

Foundations and Trends in Databases
Privacy-preserving data publishing: A survey of recent developments

ACM Computing Surveys (CSUR)
Optimizing linear counting queries under differential privacy

Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Universally optimal privacy mechanisms for minimax agents

Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Towards an axiomatization of statistical privacy and utility

Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Boosting the accuracy of differentially private histograms through consistency

Proceedings of the VLDB Endowment
Differentially private data cubes: optimizing noise sources and consistency

Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Selling privacy at auction

Proceedings of the 12th ACM conference on Electronic commerce
On the relation between differential privacy and quantitative information flow

ICALP'11 Proceedings of the 38th international conference on Automata, languages and programming - Volume Part II
An information theoretic privacy and utility measure for data sanitization mechanisms

Proceedings of the second ACM conference on Data and Application Security and Privacy
Calibrating noise to sensitivity in private data analysis

TCC'06 Proceedings of the Third conference on Theory of Cryptography
A workflow for differentially-private graph synthesis

Proceedings of the 2012 ACM workshop on Workshop on online social networks
A theory of pricing private data

Proceedings of the 16th International Conference on Database Theory

Abstract

In statistical privacy, utility refers to two concepts: information preservation -- how much statistical information is retained by a sanitizing algorithm, and usability -- how (and with how much difficulty) does one extract this information to build statistical models, answer queries, etc. Some scenarios incentivize a separation between information preservation and usability, so that the data owner first chooses a sanitizing algorithm to maximize a measure of information preservation and, afterward, the data consumers process the sanitized output according to their needs [22, 46]. We analyze a variety of utility measures and show that the average (over possible outputs of the sanitizer) error of Bayesian decision makers forms the unique class of utility measures that satisfy three axioms related to information preservation. The axioms are agnostic to Bayesian concepts such as subjective probabilities and hence strengthen support for Bayesian views in privacy research. In particular, this result connects information preservation to aspects of usability -- if the information preservation of a sanitizing algorithm should be measured as the average error of a Bayesian decision maker, shouldn't Bayesian decision theory be a good choice when it comes to using the sanitized outputs for various purposes? We put this idea to the test in the unattributed histogram problem where our decision-theoretic post-processing algorithm empirically outperforms previously proposed approaches.
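
The notion of "average error of a Bayesian decision maker" can be made concrete with a small sketch. It is an illustration only, not the paper's formal definition: it assumes a finite input domain, a sanitizer given as a row-stochastic matrix, and a loss table, and the names `bayes_risk`, `prior`, `mechanism`, and `loss` are hypothetical. For each possible sanitized output, the decision maker picks the action with the smallest posterior-weighted loss, and the resulting errors are summed over outputs.

```python
def bayes_risk(prior, mechanism, loss):
    """Average error of a Bayesian decision maker (illustrative sketch).

    prior[i]        : assumed prior probability of true input i
    mechanism[i][o] : probability the sanitizer emits output o on input i
    loss[a][i]      : loss of taking action a when the true input is i
    """
    n_inputs = len(prior)
    n_outputs = len(mechanism[0])
    total = 0.0
    for o in range(n_outputs):
        # Joint probability of (input i, output o); proportional to the posterior.
        joint = [prior[i] * mechanism[i][o] for i in range(n_inputs)]
        # The Bayes-optimal action minimizes the joint-weighted loss for this output.
        total += min(sum(j * l for j, l in zip(joint, row)) for row in loss)
    return total


# Hypothetical example: one binary value, uniform prior, 0-1 loss.
prior = [0.5, 0.5]
zero_one = [[0.0, 1.0], [1.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]          # releases the value unchanged
randomized = [[0.75, 0.25], [0.25, 0.75]]    # keeps the value with probability 0.75
constant = [[1.0, 0.0], [1.0, 0.0]]          # output carries no information

for name, m in [("identity", identity), ("randomized", randomized), ("constant", constant)]:
    print(name, bayes_risk(prior, m, zero_one))
```

Under this measure a lower value means more information is preserved: the identity mechanism leaves the decision maker with no error, while the constant mechanism leaves it no better off than guessing from the prior.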