Privacy-preserving data mining
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
On the design and quantification of privacy preserving data mining algorithms
PODS '01 Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Privacy preserving mining of association rules
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Privacy preserving association rule mining in vertically partitioned data
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Building decision tree classifier on private data
CRPIT '14 Proceedings of the IEEE international conference on Privacy, security and data mining - Volume 14
Using randomized response techniques for privacy-preserving data mining
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Anonymity-preserving data collection
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
A Privacy Preserving Markov Model for Sequence Classification
Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics
Privacy-preserving Kruskal-Wallis test
Computer Methods and Programs in Biomedicine
Hi-index | 0.00 |
In this paper, we study the privacy-preserving decision tree building problem on vertically partitioned data. We made two contributions. First, we propose a novel hybrid approach, which takes advantage of the strength of the two existing approaches, randomization and the secure multi-party computation (SMC), to balance the accuracy and efficiency constraints. Compared to these two existing approaches, our proposed approach can achieve much better accuracy than randomization approach and much reduced computation cost than SMC approach. We also propose a multi-group scheme that makes it flexible for data miners to control the balance between data mining accuracy and privacy. We partition attributes into groups, and develop a scheme to conduct group-based randomization to achieve better data mining accuracy. We have implemented and evaluated the proposed schemes for the ID3 decision tree algorithm.