A framework for privacy preserving classification in data mining

Authors:
Md. Zahidul Islam;Ljiljana Brankovic
Affiliations:
The University of Newcastle, Callaghan, NSW, Australia;The University of Newcastle, Callaghan, NSW, Australia
Venue:
ACSW Frontiers '04 Proceedings of the second workshop on Australasian information security, Data Mining and Web Intelligence, and Software Internationalisation - Volume 32
Year:
2004

Abstract

Organizations all over the world now depend on mining very large datasets. These datasets typically contain sensitive information about individuals, which inevitably gets exposed to different parties. Consequently, privacy issues are constantly in the limelight, and public dissatisfaction may well threaten the practice of data mining and all its benefits. It is thus of great importance to develop adequate security techniques for protecting the confidentiality of individual values used for data mining.

In the last 30 years several techniques have been proposed in the context of statistical databases. It was noticed early on that careless noise addition introduces biases into statistical parameters, including means, variances and covariances, and sophisticated techniques that avoid these biases were developed. However, when these techniques are applied in the context of data mining, they do not appear to be bias-free. Wilson and Rosen (2002) suggest the existence of Type Data Mining (DM) bias, which relates to the loss of underlying patterns in the database and cannot be eliminated by preserving simple statistical parameters. In this paper we propose a noise addition framework specifically tailored towards the classification task in data mining. It builds upon some previous techniques that introduce noise to the class and the so-called innocent attributes. Our framework extends these techniques to the influential attributes; additionally, it caters for the preservation of the variances and covariances, along with patterns, thus making the perturbed dataset useful for both statistical and data mining purposes. Our preliminary experimental results indicate that data patterns are largely preserved, suggesting the absence of DM bias.
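
The variance bias mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the framework proposed in the paper: it is a generic single-attribute demonstration, and the function names (`add_noise`, `rescale`), the noise ratio of 0.5 and the synthetic Gaussian data are all assumptions made for illustration. Adding independent zero-mean noise inflates the variance by roughly a factor of (1 + ratio); shrinking the noisy values towards their mean by the square root of that factor restores the expected variance while leaving the mean unchanged.

```python
import random
import statistics


def add_noise(values, ratio, rng):
    """Add independent zero-mean Gaussian noise with variance ratio * Var(values)."""
    sd = (ratio * statistics.pvariance(values)) ** 0.5
    return [v + rng.gauss(0.0, sd) for v in values]


def rescale(noisy, ratio):
    """Shrink noisy values towards their mean to undo the expected variance inflation."""
    mean = statistics.fmean(noisy)
    factor = (1.0 + ratio) ** 0.5
    return [mean + (v - mean) / factor for v in noisy]


# Synthetic attribute (assumed for illustration only).
rng = random.Random(0)
original = [rng.gauss(50.0, 10.0) for _ in range(20000)]

ratio = 0.5  # noise variance as a fraction of the data variance (assumed)
naive = add_noise(original, ratio, rng)
corrected = rescale(naive, ratio)

var_original = statistics.pvariance(original)
var_naive = statistics.pvariance(naive)
var_corrected = statistics.pvariance(corrected)

print("original variance: ", round(var_original, 2))
print("naive variance:    ", round(var_naive, 2))
print("corrected variance:", round(var_corrected, 2))
```

Preserving the variance in this way says nothing about whether classification patterns survive the perturbation, which is the distinction the abstract draws between statistical bias and DM bias.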