A Tree-Based Data Perturbation Approach for Privacy-Preserving Data Mining

  • Authors:
  • Xiao-Bai Li;Sumit Sarkar

  • Affiliations:
  • -;IEEE Computer Society

  • Venue:
  • IEEE Transactions on Knowledge and Data Engineering
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

Due to growing concerns about the privacy of personal information, organizations that use their customers' records in data mining activities are forced to take actions to protect the privacy of the individuals. A frequently used disclosure protection method is data perturbation. When used for data mining, it is desirable that perturbation preserves statistical relationships between attributes, while providing adequate protection for individual confidential data. To achieve this goal, we propose a kd-tree based perturbation method, which recursively partitions a data set into smaller subsets such that data records within each subset are more homogeneous after each partition. The confidential data in each final subset are then perturbed using the subset average. An experimental study is conducted to show the effectiveness of the proposed method.