A semantic information loss metric for privacy preserving publication

  • Authors:
  • Yu Liu; Ting Wang; Jianhua Feng

  • Affiliations:
  • Department of Computer Science and Technology, Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing, China (all authors)

  • Venue:
  • DASFAA'10: Proceedings of the 15th International Conference on Database Systems for Advanced Applications, Volume Part II
  • Year:
  • 2010

Abstract

Data distortion is inevitable in privacy-preserving data publication, and many quality metrics have been proposed to measure the quality of anonymized data, among which information loss metrics are widely used. Most existing information loss metrics, however, are non-semantic and are therefore limited in how well they reflect data distortion. As a result, the utility of anonymized data produced under these metrics is constrained. In this paper, we propose a novel semantic information loss metric, SILM, which takes into account the correlation among attributes. This new metric captures distortion more precisely than state-of-the-art information loss metrics, especially in scenarios where strong correlations exist among attributes. We evaluate the effect of SILM on data quality in terms of the accuracy of aggregate query answering and classification. Comprehensive experiments demonstrate that SILM helps improve the quality of anonymized data, especially when integrated with proper anonymization algorithms.