Priority driven k-anonymisation for privacy protection

  • Authors:
  • Xiaoxun Sun; Hua Wang; Jiuyong Li

  • Affiliations:
  • University of Southern Queensland, Toowoomba, Queensland, Australia; University of Southern Queensland, Toowoomba, Queensland, Australia; University of South Australia, Adelaide, Australia

  • Venue:
  • AusDM '08 Proceedings of the 7th Australasian Data Mining Conference - Volume 87
  • Year:
  • 2008

Abstract

Given the threat of re-identification in our growing digital society, guaranteeing privacy while providing worthwhile data for knowledge discovery has become a difficult problem. k-anonymity is a major technique for ensuring privacy by generalizing and suppressing attributes, and it has been the focus of intense research in the last few years. However, data modification techniques such as generalization may produce anonymised data that is unusable for medical studies because some attributes become too coarse-grained. In this paper, we propose a priority driven k-anonymisation that allows the degree of acceptable distortion to be specified for each attribute separately. We also define appropriate metrics to measure distance and information loss, which are suitable for both numerical and categorical attributes. Further, we formulate the priority driven k-anonymisation as a k-nearest neighbor (KNN) clustering problem by adding the constraint that each cluster contains at least k tuples. We develop an efficient algorithm for priority driven k-anonymisation. Experimental results show that the proposed technique causes significantly less distortion.
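The clustering formulation in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm or their metrics, which the abstract does not specify. It assumes a per-attribute weight as a stand-in for "priority", a range-normalised difference for numerical attributes, and a plain 0/1 mismatch for categorical ones. The names `weighted_distance`, `greedy_k_clusters`, `weights`, and `ranges` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple, Union

Value = Union[float, str]
Record = Tuple[Value, ...]


def weighted_distance(
    a: Record,
    b: Record,
    weights: Sequence[float],
    ranges: Sequence[Optional[float]],
) -> float:
    """Weighted distance between two records (an assumed metric, not the paper's).

    ranges[i] is the value range of a numerical attribute, or None for a
    categorical attribute, which is compared by simple equality.
    """
    total = 0.0
    for x, y, w, r in zip(a, b, weights, ranges):
        if r is None:
            # Categorical: 0 if equal, 1 otherwise (assumed simplification).
            total += w * (0.0 if x == y else 1.0)
        else:
            # Numerical: absolute difference normalised by the attribute range.
            total += w * abs(x - y) / r
    return total


def greedy_k_clusters(
    records: Sequence[Record],
    k: int,
    weights: Sequence[float],
    ranges: Sequence[Optional[float]],
) -> List[List[int]]:
    """Greedily group record indices so that every cluster has at least k members."""
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    remaining = list(range(len(records)))
    clusters: List[List[int]] = []
    # Carve off a cluster only while at least k records would be left over.
    while len(remaining) >= 2 * k:
        seed = remaining.pop(0)
        # Order the rest by distance to the seed and take its k-1 nearest.
        remaining.sort(
            key=lambda i: weighted_distance(records[seed], records[i], weights, ranges)
        )
        clusters.append([seed] + remaining[: k - 1])
        remaining = remaining[k - 1 :]
    # The leftover k to 2k-1 records form the final cluster.
    clusters.append(remaining)
    return clusters


# Hypothetical data: (age, sex). Age gets a higher weight than sex, so
# clusters are formed mainly by age and age needs less generalization.
people = [(25, "M"), (27, "M"), (26, "F"), (60, "F"), (62, "F"), (61, "M"), (40, "M")]
groups = greedy_k_clusters(people, k=3, weights=(1.0, 0.1), ranges=(37.0, None))
print(groups)
```

Each resulting cluster would then be generalized to a common value per attribute. Weighting an attribute more heavily keeps records that differ on it apart, which is one plausible way to limit how much that attribute is distorted.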