Differential privacy based on importance weighting

Authors:
Zhanglong Ji; Charles Elkan
Affiliations:
Department of Computer Science and Engineering 0404, University of California, San Diego, USA;Department of Computer Science and Engineering 0404, University of California, San Diego, USA
Venue:
Machine Learning
Year:
2013

Abstract

This paper analyzes a novel method for publishing data while still protecting privacy. The method is based on computing weights that make an existing dataset, for which there are no confidentiality issues, analogous to the dataset that must be kept private. The existing dataset may be genuine but public already, or it may be synthetic. The weights are importance sampling weights, but to protect privacy, they are regularized and have noise added. The weights allow statistical queries to be answered approximately while provably guaranteeing differential privacy. We derive an expression for the asymptotic variance of the approximate answers. Experiments show that the new mechanism performs well even when the privacy budget is small, and when the public and private datasets are drawn from different populations.
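
The idea of answering a statistical query from a weighted public dataset can be illustrated with a minimal sketch. This is not the paper's mechanism: the function names, the weight cap, and the noise scale are all hypothetical, and the Laplace noise here is not calibrated to any sensitivity, so the sketch carries no differential privacy guarantee. To keep it self-contained, it assumes a toy case where the public data are drawn from N(0, 1), the private population is N(0.5, 1), and the exact importance weights are therefore known in closed form.

```python
import numpy as np


def perturb_weights(weights, max_weight, noise_scale, rng):
    """Cap large weights, then add Laplace noise (illustrative only).

    The cap is a crude stand-in for regularization, and noise_scale is a
    hypothetical placeholder: it is NOT derived from a sensitivity bound,
    so this function alone gives no privacy guarantee.
    """
    capped = np.minimum(np.asarray(weights, dtype=float), max_weight)
    noisy = capped + rng.laplace(loc=0.0, scale=noise_scale, size=capped.shape)
    # Floor at a tiny positive value so the normalizing sum stays positive.
    return np.maximum(noisy, 1e-12)


def answer_query(public_data, weights, query):
    """Self-normalized weighted average of query(x) over the public data."""
    values = np.asarray(query(public_data), dtype=float)
    return float(np.sum(weights * values) / np.sum(weights))


rng = np.random.default_rng(0)

# Public dataset: samples from N(0, 1) (assumed for this toy example).
public_data = rng.normal(loc=0.0, scale=1.0, size=20_000)

# Exact density ratio N(0.5, 1) / N(0, 1) at x is exp(0.5 * x - 0.125).
exact_weights = np.exp(0.5 * public_data - 0.125)

released_weights = perturb_weights(
    exact_weights, max_weight=10.0, noise_scale=0.05, rng=rng
)

# Query: what fraction of the private population lies above zero?
estimate = answer_query(public_data, released_weights, lambda x: x > 0.0)
print(f"estimated fraction above zero: {estimate:.3f}")
```

For comparison, the true fraction of an N(0.5, 1) population above zero is about 0.69, so the printed estimate can be checked against that value. In a real setting the density ratio is unknown and would have to be estimated from the private data, which is where regularization and properly calibrated noise become necessary.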