A Practical Differentially Private Random Decision Tree Classifier

Authors:
Geetha Jagannathan; Krishnan Pillaipakkamnatt; Rebecca N. Wright
Affiliations:
Department of Computer Science, Columbia University, NY, USA. E-mail: geetha@cs.columbia.edu
Department of Computer Science, Hofstra University, Hempstead, NY, USA. E-mail: csckzp@hofstra.edu
Department of Computer Science, Rutgers University, New Brunswick, NJ, USA. E-mail: rebecca.wright@rutgers.edu
Venue:
Transactions on Data Privacy
Year:
2012

Abstract

In this paper, we study the problem of constructing private classifiers using decision trees, within the framework of differential privacy. We first present experimental evidence that creating a differentially private ID3 tree using differentially private low-level queries does not simultaneously provide good privacy and good accuracy, particularly for small datasets. In search of better privacy and accuracy, we then present a differentially private decision tree ensemble algorithm based on random decision trees. We demonstrate experimentally that this approach yields good prediction while maintaining good privacy, even for small datasets. We also present differentially private extensions of our algorithm to two settings: (1) new data is periodically appended to an existing database and (2) the database is horizontally or vertically partitioned between multiple users.
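The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of a differentially private random decision tree ensemble, not the authors' implementation. It assumes categorical attributes, tree structures drawn without looking at the data, Laplace noise added to the per-leaf class counts, and a privacy budget split evenly across trees (one added or removed record changes one count per tree by 1). All names (`build_random_tree`, `train_private_ensemble`, `predict`) and parameters are hypothetical, and Python's `random` module is used only for illustration; it is not a secure noise source.

```python
import random
from collections import Counter


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def build_random_tree(attributes, depth, rng):
    """Pick split attributes at random; the training data is never consulted."""
    if depth == 0 or not attributes:
        return {"leaf": True, "counts": Counter()}
    name = rng.choice(sorted(attributes))
    rest = {k: v for k, v in attributes.items() if k != name}
    children = {v: build_random_tree(rest, depth - 1, rng)
                for v in attributes[name]}
    return {"leaf": False, "attr": name, "children": children}


def find_leaf(tree, record):
    while not tree["leaf"]:
        tree = tree["children"][record[tree["attr"]]]
    return tree


def leaves(tree):
    if tree["leaf"]:
        yield tree
    else:
        for child in tree["children"].values():
            yield from leaves(child)


def train_private_ensemble(records, labels, attributes, classes,
                           num_trees, depth, epsilon, seed=0):
    rng = random.Random(seed)  # illustration only, not a secure RNG
    # Assumption: each tree receives epsilon / num_trees of the budget,
    # so the Laplace scale per count is num_trees / epsilon.
    scale = num_trees / epsilon
    ensemble = []
    for _ in range(num_trees):
        tree = build_random_tree(attributes, depth, rng)
        for record, label in zip(records, labels):
            find_leaf(tree, record)["counts"][label] += 1
        for leaf in leaves(tree):
            leaf["noisy"] = {c: leaf["counts"][c] + laplace(scale, rng)
                             for c in classes}
            del leaf["counts"]  # keep only the noisy counts
        ensemble.append(tree)
    return ensemble


def predict(ensemble, record, classes):
    # Sum the noisy class counts of the leaves the record reaches.
    totals = {c: 0.0 for c in classes}
    for tree in ensemble:
        leaf = find_leaf(tree, record)
        for c in classes:
            totals[c] += leaf["noisy"][c]
    return max(classes, key=lambda c: totals[c])
```

In this sketch, a smaller `epsilon` or a larger `num_trees` increases the noise on every leaf count, which is one way to see why small datasets make the privacy and accuracy trade-off hard.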