C4.5: Programs for Machine Learning.
Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations.
Machine Learning.
Pattern Classification (2nd Edition).
State-of-the-art in privacy preserving data mining. ACM SIGMOD Record.
Privacy-Preserving Data Mining: Models and Algorithms.
Attacks on privacy and deFinetti's theorem. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data.
Protecting individual information against inference attacks in data publishing. In DASFAA'07: Proceedings of the 12th International Conference on Database Systems for Advanced Applications.
In this paper, we introduce a novel inference attack, which we call the reconstruction attack, whose objective is to reconstruct a probabilistic version of the original dataset on which a classifier was learnt, using only the description of this classifier and possibly some auxiliary information. In a nutshell, the reconstruction attack exploits the structure of the classifier to derive a probabilistic version of the dataset on which the model was trained. Moreover, we propose a general framework for assessing the success of a reconstruction attack in terms of a novel distance between the reconstructed and original datasets. For the case of multiple releases of classifiers, we also give a strategy for merging the different reconstructed datasets into a single coherent one that is closer to the original dataset than any of the individual reconstructed datasets. Finally, we instantiate the reconstruction attack on a decision tree classifier learnt with the C4.5 algorithm and evaluate its efficiency experimentally. The results of this evaluation demonstrate that the proposed attack is able to reconstruct a significant part of the original dataset, highlighting the need to develop new learning algorithms whose output is specifically tailored to mitigate the success of this type of attack.
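To make the idea concrete, the following minimal sketch shows how a white-box decision tree can leak a probabilistic view of its training set. The tree structure, attribute names, and counts below are purely illustrative stand-ins for a released C4.5-style model (which typically exposes split tests and per-leaf class counts); this is not the paper's actual algorithm, only an assumed simplification of the attack's core step.

```python
# Hypothetical sketch of the reconstruction step on a released decision tree.
# Each internal node carries a split test; each leaf carries the per-class
# training counts observed at that leaf (information C4.5-style output exposes).

def reconstruct(node, constraints=None):
    """Walk the tree and emit, for each leaf, the region of attribute space
    it covers together with the class counts stored at that leaf -- together
    these form a probabilistic version of the original training set."""
    if constraints is None:
        constraints = {}
    if "leaf" in node:  # leaf node: report its region and class counts
        return [(dict(constraints), node["leaf"])]
    attr, thr = node["split"]                 # internal node: attribute test
    lo = dict(constraints); lo[attr] = ("<=", thr)
    hi = dict(constraints); hi[attr] = (">", thr)
    return (reconstruct(node["left"], lo)
            + reconstruct(node["right"], hi))

# Toy released tree (illustrative attribute names and counts).
tree = {
    "split": ("age", 30),
    "left":  {"leaf": {"yes": 8, "no": 2}},
    "right": {"split": ("income", 50),
              "left":  {"leaf": {"yes": 1, "no": 5}},
              "right": {"leaf": {"yes": 6, "no": 0}}},
}

for region, counts in reconstruct(tree):
    print(region, counts)
```

Each emitted pair says, e.g., "10 training records satisfy age <= 30, of which 8 are labelled yes", which is exactly the kind of probabilistic partial reconstruction the abstract describes; merging such outputs from multiple released classifiers would further narrow the regions.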