Data mining without data: a novel approach to privacy-preserving collaborative distributed data mining

  • Authors: Vikas Ashok; Ravi Mukkamala
  • Affiliations: Old Dominion University, Norfolk, VA, USA (both authors)
  • Venue: Proceedings of the 10th annual ACM workshop on Privacy in the electronic society
  • Year: 2011

Abstract

With the proliferation of organizations that independently collect various types of data, the growing awareness among corporations and the public of the need to keep sensitive data private, and the ever-increasing need of government and corporate policy makers to learn the behavior of their customers, there is a clear demand for data mining services even when data owners refuse to provide their data directly. In the past, data owners applied techniques such as random perturbation before sharing data with a third-party data miner. However, these techniques have themselves been shown to be prone to privacy violations. In this paper, we take a completely different approach: each data owner derives association rules locally, sanitizes them if necessary, and sends them to a third-party data miner. The data miner collects the local rules from all data owners, regenerates an estimate of the global data, and performs global data mining. We suggest schemes to reduce the generation of spurious rules, a possible outcome of generating data from rules. The proposed method is illustrated using an example of association rule mining. We are currently formalizing some of the underlying techniques and making them more efficient.
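
The pipeline described in the abstract (local rule mining, sanitization, regeneration of a data estimate, global mining) can be sketched roughly as follows. This is a minimal illustration and not the authors' method: the abstract does not say how data is regenerated from rules, so the replay scheme in `regenerate` is an assumption. The `Rule` record, the function names, the toy transactions, and the thresholds are likewise hypothetical.

```python
from collections import namedtuple
from itertools import combinations

# Hypothetical rule record; the field names are illustrative, not from the paper.
Rule = namedtuple("Rule", "antecedent consequent support confidence")


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_rules(transactions, min_support, min_confidence):
    """Brute-force association rule mining; suitable for toy data only."""
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            s = support(itemset, transactions)
            if s == 0 or s < min_support:
                continue
            for k in range(1, size):
                for left in combinations(combo, k):
                    ante = frozenset(left)
                    conf = s / support(ante, transactions)
                    if conf >= min_confidence:
                        rules.append(Rule(ante, itemset - ante, s, conf))
    return rules


def sanitize(rules, sensitive_items):
    """Assumed policy: drop every rule that mentions a sensitive item."""
    return [r for r in rules
            if not (r.antecedent | r.consequent) & sensitive_items]


def regenerate(rules, n):
    """Naive data estimate: replay each rule as synthetic transactions.

    Assumed scheme: a rule with support s and confidence c yields
    round(s * n) rows holding antecedent and consequent together, plus
    round(s / c * n) - round(s * n) rows holding the antecedent alone.
    """
    rows = []
    for r in rules:
        both = round(r.support * n)
        ante_total = round(r.support / r.confidence * n)
        rows += [r.antecedent | r.consequent] * both
        rows += [r.antecedent] * (ante_total - both)
    return rows


# Two data owners with toy transactions; the raw rows never leave the owner.
owner_a = [frozenset(t) for t in
           (["bread", "milk"], ["bread", "milk"], ["bread"], ["eggs"])]
owner_b = [frozenset(t) for t in
           (["bread", "milk"], ["bread", "milk"],
            ["milk", "eggs"], ["milk", "eggs"])]

# Each owner mines locally, sanitizes, and shares only rules plus a row count.
shared = [(sanitize(mine_rules(data, 0.5, 0.6), {"eggs"}), len(data))
          for data in (owner_a, owner_b)]

# The third-party miner rebuilds an estimate and mines it globally.
estimate = [row for rules, n in shared for row in regenerate(rules, n)]
global_rules = mine_rules(estimate, 0.5, 0.6)
```

On this toy input, the estimate yields the same two rules (bread implies milk, milk implies bread) as mining the pooled raw transactions would. Their support is inflated, however, because the naive replay counts the same itemset once per rule that mentions it. Distortions of this kind are one reason a real scheme would need the more careful regeneration the abstract alludes to.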