Characterizing the sample complexity of private learners

Authors:
Amos Beimel;Kobbi Nissim;Uri Stemmer
Affiliations:
Ben-Gurion University, Beer Sheva, Israel;Ben-Gurion University & Harvard University, Beer Sheva, Israel;Ben-Gurion University, Beer Sheva, Israel
Venue:
Proceedings of the 4th conference on Innovations in Theoretical Computer Science
Year:
2013

Citing 15
Cited 1

A theory of the learnable

Communications of the ACM
A general lower bound on the number of examples needed for learning

Information and Computation
Learnability and the Vapnik-Chervonenkis dimension

Journal of the ACM (JACM)
The Strength of Weak Learnability

Machine Learning
A decision-theoretic generalization of on-line learning and an application to boosting

Journal of Computer and System Sciences - Special issue: 26th annual ACM symposium on the theory of computing & STOC'94, May 23–25, 1994, and second annual European conference on computational learning theory (EuroCOLT'95), March 13–15, 1995
Practical privacy: the SuLQ framework

Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Mechanism Design via Differential Privacy

FOCS '07 Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science
A learning theory approach to non-interactive database privacy

STOC '08 Proceedings of the fortieth annual ACM symposium on Theory of computing
What Can We Learn Privately?

FOCS '08 Proceedings of the 2008 49th Annual IEEE Symposium on Foundations of Computer Science
The Differential Privacy Frontier (Extended Abstract)

TCC '09 Proceedings of the 6th Theory of Cryptography Conference on Theory of Cryptography
Private Approximation of Search Problems

SIAM Journal on Computing
A firm foundation for private data analysis

Communications of the ACM
Boosting and Differential Privacy

FOCS '10 Proceedings of the 2010 IEEE 51st Annual Symposium on Foundations of Computer Science
Bounds on the sample complexity for private learning and private data release

TCC'10 Proceedings of the 7th international conference on Theory of Cryptography
Calibrating noise to sensitivity in private data analysis

TCC'06 Proceedings of the Third conference on Theory of Cryptography

Bounds on the sample complexity for private learning and private data release

Machine Learning

Abstract

In 2008, Kasiviswanathan et al. defined private learning as a combination of PAC learning and differential privacy [16]. Informally, a private learner is applied to a collection of labeled individual information and outputs a hypothesis while preserving the privacy of each individual. Kasiviswanathan et al. gave a generic construction of private learners for (finite) concept classes, with sample complexity logarithmic in the size of the concept class. This sample complexity is higher than what non-private learners need, leaving open the possibility that private learning sometimes requires significantly more samples than non-private learning. We give a combinatorial characterization of the sample size sufficient and necessary to privately learn a class of concepts. This characterization is analogous to the well-known characterization of the sample complexity of non-private learning in terms of the VC dimension of the concept class. We introduce the notion of a probabilistic representation of a concept class, and our new complexity measure RepDim corresponds to the size of the smallest probabilistic representation of the concept class. We show that any private learning algorithm for a concept class C with sample complexity m implies RepDim(C) = O(m), and that there exists a private learning algorithm with sample complexity m = O(RepDim(C)). We further demonstrate that a similar characterization holds for the database size needed for privately computing a large class of optimization problems, and also for the well-studied problem of private data release.
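
The generic construction of private learners for finite concept classes mentioned in the abstract can be illustrated with the exponential mechanism. The following is a minimal sketch under stated assumptions, not the authors' code: the function names `selection_probabilities` and `private_learner`, the toy class of threshold functions, and the sample are all hypothetical and chosen only for illustration. Each hypothesis is scored by its number of errors on the sample, and a hypothesis is drawn with probability proportional to exp(-epsilon * errors / 2).

```python
import math
import random


def selection_probabilities(sample, hypotheses, epsilon):
    """Exponential-mechanism distribution over a finite hypothesis class.

    The score of a hypothesis is its number of misclassified examples.
    Changing one labeled example changes every score by at most 1, so
    weights proportional to exp(-epsilon * errors / 2) give
    epsilon-differential privacy.
    """
    errors = [sum(1 for x, y in sample if h(x) != y) for h in hypotheses]
    # Shifting by the minimum error avoids underflow and leaves the
    # normalized distribution unchanged.
    best = min(errors)
    weights = [math.exp(-epsilon * (e - best) / 2.0) for e in errors]
    total = sum(weights)
    return [w / total for w in weights]


def private_learner(sample, hypotheses, epsilon, rng):
    """Return the index of a hypothesis drawn from the distribution above."""
    probs = selection_probabilities(sample, hypotheses, epsilon)
    return rng.choices(range(len(hypotheses)), weights=probs, k=1)[0]


def threshold(t):
    """Hypothetical toy concept: label 1 exactly when x >= t."""
    return lambda x: 1 if x >= t else 0


# Toy finite class: thresholds 0..10 over the domain {0, ..., 10}.
hypotheses = [threshold(t) for t in range(11)]

# Illustrative sample: 40 copies of each domain point, labeled by threshold 4.
sample = [(x, 1 if x >= 4 else 0) for x in range(11) for _ in range(40)]

chosen = private_learner(sample, hypotheses, epsilon=1.0, rng=random.Random(0))
print("selected threshold:", chosen)
```

In this toy setup every wrong threshold misclassifies at least 40 examples, so nearly all of the probability mass falls on the target threshold. With fewer examples the gap between scores shrinks and the output becomes noisier, which is the sense in which privacy costs samples.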