Data mining without data: a novel approach to privacy-preserving collaborative distributed data mining

Authors:
Vikas Ashok; Ravi Mukkamala
Affiliations:
Old Dominion University, Norfolk, VA, USA; Old Dominion University, Norfolk, VA, USA
Venue:
Proceedings of the 10th annual ACM workshop on Privacy in the electronic society
Year:
2011

Abstract

With the proliferation of organizations that independently collect various types of data, the growing awareness among corporations and the public of the need to keep sensitive data private, and the ever-increasing need of government and corporate policy makers to learn the behavior of their customers, there is a definite demand for data mining services even when the data owners refuse to provide their data directly. In the past, data owners applied techniques such as random perturbation before sharing their data with a third-party data miner. However, as has already been proven, even these techniques are prone to privacy violations. In this paper, we take a completely different approach: each data owner derives association rules locally, sanitizes them if necessary, and sends them to a third-party data miner. The data miner collects local rules from all data owners, regenerates an estimate of the global data, and performs global data mining. We suggest schemes to reduce the generation of spurious rules, a possible outcome of generating data from rules. The proposed method is illustrated using an example of association rule mining. We are currently formalizing some of the underlying techniques and making them more efficient.
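
The pipeline described in the abstract (local rule mining, optional sanitization, regeneration of estimated data, global mining) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not specify how rules are sanitized or how data is regenerated, so every function name (`mine_rules`, `sanitize`, `regenerate`, `mine_global`), the thresholds, the synthetic sample size `n`, and the regeneration heuristic below are assumptions made purely for illustration. The sketch only handles rules between single items.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def mine_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Brute-force mining of single-item rules (illustration only).

    Returns tuples (antecedent, consequent, support, confidence).
    """
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support(frozenset({a, b}), transactions)
        if pair_support < min_support:
            continue
        for ante, cons in ((a, b), (b, a)):
            confidence = pair_support / support(frozenset({ante}), transactions)
            if confidence >= min_confidence:
                rules.append(
                    (frozenset({ante}), frozenset({cons}), pair_support, confidence)
                )
    return rules


def sanitize(rules, sensitive_items):
    """Assumed sanitization policy: drop any rule that mentions a sensitive item."""
    return [r for r in rules if not ((r[0] | r[1]) & sensitive_items)]


def regenerate(rules, n=10):
    """Naive, assumed heuristic for rebuilding data from rules.

    For each rule, emit about support * n transactions containing the whole
    itemset and enough antecedent-only transactions to match the confidence.
    Overlapping rules are counted independently, which can distort supports
    and is one way spurious rules may arise.
    """
    synthetic = []
    for ante, cons, sup, conf in rules:
        both = round(sup * n)
        ante_only = round((sup / conf - sup) * n)
        synthetic += [ante | cons] * both
        synthetic += [ante] * ante_only
    return synthetic


def mine_global(local_rule_sets, n=10, min_support=0.4, min_confidence=0.7):
    """Third-party miner: pool synthetic data from all owners, then mine it."""
    pooled = [t for rules in local_rule_sets for t in regenerate(rules, n)]
    return mine_rules(pooled, min_support, min_confidence)


if __name__ == "__main__":
    # Hypothetical data held privately by two owners.
    owner_a = [{"bread", "milk"}, {"bread", "milk"},
               {"bread", "milk", "eggs"}, {"bread"}, {"eggs"}]
    owner_b = [{"bread", "milk"}, {"bread", "milk"}, {"milk", "tea"}, {"tea"}]
    # Only the rules leave each owner; the raw transactions never do.
    shared = [sanitize(mine_rules(owner_a), {"eggs"}),
              sanitize(mine_rules(owner_b), {"tea"})]
    for ante, cons, sup, conf in mine_global(shared):
        print(sorted(ante), "->", sorted(cons), round(sup, 2), round(conf, 2))
```

In this sketch the miner only ever sees rule tuples, never transactions. Because `regenerate` treats each rule independently, the pooled supports are estimates rather than true global values; a more careful scheme would reconcile overlapping rules before generating data.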