Data mining techniques are widely used in research disciplines such as medicine, the life sciences, and the social sciences to extract useful knowledge (such as mining models) from research data. Research data often needs to be published along with the mining model for verification or reanalysis, but the privacy of the published data must be protected; otherwise the data is vulnerable to misuse such as linking attacks. Employing privacy protection methods therefore becomes necessary. Existing methods, however, consider only privacy protection and do not guarantee that the same mining models can be built from the sanitized data, so the published models cannot be verified against it. This article proposes a technique that both protects privacy and guarantees that the same model, in the form of decision trees or regression trees, can be built from the sanitized data. We also show experimentally that other mining techniques can be used to reanalyze the sanitized data. The technique can thus promote the sharing of research data.
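To make the core idea concrete, the following is a minimal pure-Python sketch, not the paper's actual algorithm: values are perturbed within each region induced by a decision split, so a model retrained on the sanitized data yields the same partition of the records. The one-dimensional "stump" learner, the `best_split` and `sanitize` helpers, and the toy data are all illustrative assumptions.

```python
# Illustrative sketch only (NOT the authors' method): sanitize data by
# replacing each value with the mean of its side of a decision split,
# so that a retrained one-dimensional decision stump induces the same
# partition of the records as the original model.

def best_split(xs, ys):
    """Return the threshold t minimizing errors of the rule: predict 1 iff x >= t."""
    best_t, best_err = None, None
    for t in sorted(set(xs)):
        err = sum((x >= t) != y for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def sanitize(xs, threshold):
    """Replace each value with the mean of its side of the split,
    preserving which side of the boundary every record falls on."""
    left = [x for x in xs if x < threshold]
    right = [x for x in xs if x >= threshold]
    left_mean = sum(left) / len(left)
    right_mean = sum(right) / len(right)
    return [left_mean if x < threshold else right_mean for x in xs]

# Toy data: feature values and binary class labels.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [0, 0, 0, 1, 1, 1]

t = best_split(xs, ys)    # split learned from the original data
sx = sanitize(xs, t)      # sanitized (perturbed) feature values
t2 = best_split(sx, ys)   # split relearned from the sanitized data

# The relearned threshold may differ numerically, but it partitions the
# records identically, so the retrained stump is the same model.
same = [x >= t2 for x in sx] == [x >= t for x in xs]
```

The design point is that individual values are hidden (many originals map to one sanitized value, as in a perturbation or anonymization scheme), while the split structure that the tree learner depends on is preserved by construction.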