Clustering-Based k-anonymity

  • Authors:
  • Xianmang He;HuaHui Chen;Yefang Chen;Yihong Dong;Peng Wang;Zhenhua Huang

  • Affiliations:
  • School of Information Science and Technology, NingBo University, Ning Bo, P.R. China;School of Information Science and Technology, NingBo University, Ning Bo, P.R. China;School of Information Science and Technology, NingBo University, Ning Bo, P.R. China;School of Information Science and Technology, NingBo University, Ning Bo, P.R. China;School of Computer Science and Technology, Fudan University, Shanghai, P.R. China;School of Electronic and Information Engineering, Tongji University, Shanghai, P.R. China

  • Venue:
  • PAKDD'12 Proceedings of the 16th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part I
  • Year:
  • 2012

Abstract

Privacy is one of the major concerns when data containing sensitive information must be released for ad hoc analysis, and it has attracted wide research interest in privacy-preserving data publishing over the past few years. One strategy for anonymizing data is generalization. In a typical generalization approach, the tuples in a table are first divided into QI-groups (quasi-identifier groups) such that the size of each QI-group is no less than k. Clustering partitions the tuples into clusters such that the points within a cluster are more similar to each other than to points in different clusters. The two methods share a common feature: both distribute the tuples into many small groups. Motivated by this observation, we propose a clustering-based k-anonymity algorithm, which achieves k-anonymity through clustering. Extensive experiments on real data sets show that our approach improves utility.
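The link between clustering and generalization described in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm proposed in the paper, whose details the abstract does not give. It assumes numeric quasi-identifiers, squared Euclidean distance, a simple greedy nearest-neighbour grouping, and generalization of each group to per-attribute (min, max) ranges. The names `greedy_k_clusters`, `generalize`, `centroid`, and `sq_dist` are hypothetical.

```python
from typing import List, Tuple

Record = Tuple[float, ...]  # one tuple of numeric quasi-identifier values


def sq_dist(a: Record, b: Record) -> float:
    """Squared Euclidean distance between two records."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(cluster: List[Record]) -> Record:
    """Per-attribute mean of the records in a cluster."""
    return tuple(sum(col) / len(col) for col in zip(*cluster))


def greedy_k_clusters(records: List[Record], k: int) -> List[List[Record]]:
    """Partition records into clusters that each hold at least k records.

    Illustrative greedy heuristic (an assumption, not the paper's method):
    take a seed record, group it with its k - 1 nearest remaining records,
    and repeat. Fewer than k leftover records join the nearest cluster.
    """
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    remaining = list(records)
    clusters: List[List[Record]] = []
    while len(remaining) >= k:
        seed = remaining.pop(0)
        # Order the remaining records by distance to the seed.
        remaining.sort(key=lambda r: sq_dist(seed, r))
        clusters.append([seed] + remaining[: k - 1])
        remaining = remaining[k - 1:]
    for r in remaining:
        # Attach each leftover record to the cluster with the closest centroid.
        nearest = min(clusters, key=lambda c: sq_dist(centroid(c), r))
        nearest.append(r)
    return clusters


def generalize(cluster: List[Record]) -> Tuple[Tuple[float, float], ...]:
    """Replace a cluster's values with one (min, max) range per attribute."""
    return tuple((min(col), max(col)) for col in zip(*cluster))


if __name__ == "__main__":
    # Toy data: (age, region code); values are made up for illustration.
    data = [(25, 10), (26, 11), (27, 12), (52, 40), (55, 42), (53, 41)]
    for group in greedy_k_clusters(data, k=3):
        print(generalize(group), "covers", len(group), "records")
    # ((25, 27), (10, 12)) covers 3 records
    # ((52, 55), (40, 42)) covers 3 records
```

Every record in a group shares the same generalized quasi-identifier ranges, so each published range matches at least k records. Tighter clusters yield narrower ranges, which is the sense in which better clustering can improve utility.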