Optimizing Privacy-Accuracy Tradeoff for Privacy Preserving Distance-Based Classification

  • Authors: Aryya Gangopadhyay; Zhiyuan Chen; Dongjin Kim

  • Affiliations: University of Maryland Baltimore County, USA (all authors)

  • Venue: International Journal of Information Security and Privacy
  • Year: 2012

Abstract

Privacy concerns often prevent organizations from sharing data for data mining purposes. There is a rich literature on privacy preserving data mining techniques that protect privacy while still allowing accurate mining. Many such techniques have parameters that must be set correctly to achieve the desired balance between privacy protection and the quality of mining results. However, there has been little research on how to tune these parameters effectively. This paper studies the problem of tuning the group size parameter for a popular privacy preserving distance-based mining technique: the condensation method. The contributions include: (1) a class-wise condensation method that selects an appropriate group size based on heuristics and avoids generating groups with mixed classes, and (2) a rule-based approach that uses binary search and several rules to further optimize the setting of the group size parameter. The experimental results demonstrate the effectiveness of the authors' approach.
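
To make the idea of tuning the group size by binary search concrete, the following is a minimal sketch and not the authors' algorithm. The abstract does not state the rules the paper combines with the search, so none are reproduced here. The sketch assumes that a larger group size gives stronger privacy, that mining accuracy does not increase as the group size grows, and that a caller supplies a hypothetical evaluate_accuracy(k) function that condenses the data with group size k and returns the resulting accuracy. All names and parameters are illustrative.

```python
from typing import Callable, Optional


def largest_acceptable_group_size(
    evaluate_accuracy: Callable[[int], float],  # hypothetical callback, not from the paper
    min_accuracy: float,
    k_low: int,
    k_high: int,
) -> Optional[int]:
    """Return the largest group size in [k_low, k_high] whose accuracy
    is at least min_accuracy, or None if no group size qualifies.

    Assumption: accuracy is non-increasing in the group size, so the
    acceptable group sizes form a prefix of the range and binary search
    applies.
    """
    best = None
    while k_low <= k_high:
        k_mid = (k_low + k_high) // 2
        if evaluate_accuracy(k_mid) >= min_accuracy:
            # k_mid is accurate enough; try larger groups for more privacy.
            best = k_mid
            k_low = k_mid + 1
        else:
            # Accuracy too low; shrink the group size.
            k_high = k_mid - 1
    return best


# Toy stand-in for a real evaluation: accuracy (in percent) drops by one
# point for every two additional records per group.
def toy_accuracy(k: int) -> int:
    return 95 - k // 2


chosen_k = largest_acceptable_group_size(toy_accuracy, 90, 1, 50)
```

In practice each call to evaluate_accuracy would involve condensing the data and training a classifier, so the search costs a logarithmic number of such runs rather than one per candidate group size. If accuracy is not monotone in the group size, the search may miss the best setting, which is presumably one reason additional rules are useful.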