Organizations all over the world now depend on mining very large datasets. These datasets typically contain sensitive individual information, which is inevitably exposed to various parties. Privacy issues are consequently under constant scrutiny, and public dissatisfaction may well threaten the practice of data mining and all its benefits. It is thus of great importance to develop adequate security techniques for protecting the confidentiality of individual values used for data mining.

In the last 30 years several techniques have been proposed in the context of statistical databases. It was noticed early on that careless noise addition introduces biases into statistical parameters, including means, variances and covariances, and sophisticated techniques that avoid these biases were developed. However, when these techniques are applied in the context of data mining, they do not appear to be bias-free. Wilson and Rosen (2002) suggest the existence of Type Data Mining (DM) bias, which relates to the loss of underlying patterns in the database and cannot be eliminated by preserving simple statistical parameters. In this paper we propose a noise addition framework specifically tailored to the classification task in data mining. It builds upon previous techniques that introduce noise to the class and the so-called innocent attributes. Our framework extends these techniques to the influential attributes; additionally, it caters for the preservation of variances and covariances, along with patterns, thus making the perturbed dataset useful for both statistical and data mining purposes. Our preliminary experimental results indicate that data patterns are highly preserved, suggesting the absence of DM bias.
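
The variance bias attributed above to careless noise addition can be illustrated with a minimal sketch. This is not the framework proposed in the paper: the function names, the Gaussian noise model, the sample size and the simple linear rescaling used as a correction are all assumptions chosen for illustration. Adding independent zero-mean noise to an attribute inflates its variance by roughly the noise variance, and a rescaling step can restore the original mean and variance.

```python
import random
import statistics


def add_noise(values, noise_sd, rng):
    """Naive additive perturbation: x + e, with e drawn from N(0, noise_sd^2)."""
    return [x + rng.gauss(0.0, noise_sd) for x in values]


def rescale_to_original_moments(perturbed, original):
    """Hypothetical correction: linearly rescale the perturbed values so that
    their sample mean and variance match those of the original values."""
    mu_o = statistics.fmean(original)
    sd_o = statistics.pstdev(original)
    mu_p = statistics.fmean(perturbed)
    sd_p = statistics.pstdev(perturbed)
    return [mu_o + (p - mu_p) * sd_o / sd_p for p in perturbed]


# Synthetic attribute (assumed for illustration): 10,000 values from N(50, 10^2).
rng = random.Random(0)
original = [rng.gauss(50.0, 10.0) for _ in range(10_000)]

naive = add_noise(original, noise_sd=5.0, rng=rng)
corrected = rescale_to_original_moments(naive, original)

var_original = statistics.pvariance(original)
var_naive = statistics.pvariance(naive)          # inflated by about noise_sd^2
var_corrected = statistics.pvariance(corrected)  # matches var_original

print(f"original:  {var_original:.2f}")
print(f"naive:     {var_naive:.2f}")
print(f"corrected: {var_corrected:.2f}")
```

Matching the mean and variance in this way says nothing about whether classification patterns survive the perturbation, which is the DM bias the abstract is concerned with.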