Data Swapping: Balancing Privacy against Precision in Mining for Logic Rules

  • Authors: Vladimir Estivill-Castro; Ljiljana Brankovic
  • Venue: DaWaK '99, Proceedings of the First International Conference on Data Warehousing and Knowledge Discovery
  • Year: 1999

Abstract

The recent proliferation of data-mining tools for the analysis of large volumes of data has paid little attention to individual privacy issues. Here, we introduce methods aimed at finding a balance between individuals' right to privacy and data miners' need to find general patterns in huge volumes of detailed records. In particular, we focus on the data-mining task of classification with decision trees. We base our security-control mechanism on noise-addition techniques used in statistical databases because (1) the multidimensional matrix model of statistical databases and the multidimensional cubes of On-Line Analytical Processing (OLAP) are essentially the same, and (2) noise-addition techniques are very robust. The main drawback of noise-addition techniques in the context of statistical databases is the low statistical quality of the released statistics. We argue that in data mining the major requirement of a security-control mechanism (in addition to protecting privacy) is not to ensure precise and bias-free statistics, but rather to preserve the high-level descriptions of knowledge constructed by artificial data-mining tools.
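
As a rough illustration of the general idea behind swapping-based noise addition, the sketch below shuffles the values of one attribute among a random subset of records. It is not the authors' algorithm, which the abstract does not specify; the function name `swap_attribute`, the parameter `swap_fraction`, and the toy records are all hypothetical. It only shows one property that makes swapping attractive: the per-attribute value counts of the released data stay the same, while the link between an individual record and its value is perturbed.

```python
import random
from collections import Counter


def swap_attribute(records, attribute, swap_fraction, seed=0):
    """Return a perturbed copy of `records` (hypothetical helper).

    A random subset containing `swap_fraction` of the records is chosen,
    and the values of `attribute` are shuffled among the chosen records.
    All other attributes, and the input list itself, are left untouched.
    """
    rng = random.Random(seed)
    perturbed = [dict(r) for r in records]  # copy each record
    k = int(len(perturbed) * swap_fraction)
    chosen = rng.sample(range(len(perturbed)), k)
    values = [perturbed[i][attribute] for i in chosen]
    rng.shuffle(values)  # permute the selected values among themselves
    for i, value in zip(chosen, values):
        perturbed[i][attribute] = value
    return perturbed


# Toy data, invented purely for illustration.
records = [
    {"age": 25, "diagnosis": "A"},
    {"age": 37, "diagnosis": "B"},
    {"age": 41, "diagnosis": "A"},
    {"age": 52, "diagnosis": "C"},
    {"age": 63, "diagnosis": "B"},
    {"age": 70, "diagnosis": "A"},
]
released = swap_attribute(records, "diagnosis", swap_fraction=0.5)

# Shuffling only permutes existing values, so the counts are unchanged.
original_counts = Counter(r["diagnosis"] for r in records)
released_counts = Counter(r["diagnosis"] for r in released)
print(original_counts == released_counts)
```

Whether a perturbation like this also preserves the decision tree that a classifier would build depends on which records are swapped with which, and that is the trade-off between privacy and precision that the abstract refers to.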