A novel anonymization algorithm: Privacy protection and knowledge preservation

  • Authors:
  • Weijia Yang; Sanzheng Qiao

  • Affiliations:
  • Department of Computer Science, Shanghai Jiao Tong University, Shanghai 200030, China; Department of Computing and Software, McMaster University, Hamilton, Ont., Canada L8S 4K1

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2010

Abstract

In data mining and knowledge discovery, two goals conflict: privacy protection and knowledge preservation. On the one hand, data are anonymized to protect privacy; on the other hand, miners should still be able to discover useful knowledge from the anonymized data. In this paper, we present an anonymization method that provides both privacy protection and knowledge preservation. Unlike most anonymization methods, which generalize or permute data, our method anonymizes data by randomly breaking links among attribute values in records. Through this randomization, our method maintains the statistical relations among the data and thereby preserves knowledge, whereas most anonymization methods lose it. The data anonymized by our method therefore retain knowledge that is useful for statistical study. Furthermore, we propose an enhanced algorithm that adds extra privacy protection for the situation where a user's prior knowledge of the original data may cause privacy leakage. We analyze the privacy levels and the accuracy of knowledge preservation of our method, along with their relations to the method's parameters. Experimental results demonstrate that, compared with existing methods, our method is effective at both privacy protection and knowledge preservation.
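
The abstract does not spell out the algorithm, so the following is only a minimal sketch of what "randomly breaking links among attribute values in records" could look like, not the authors' method. It assumes a simple scheme: each attribute value keeps its place in its record with probability `keep_prob`, and is otherwise replaced by a value drawn at random from the same attribute across all original records. The function name `break_links` and the parameter `keep_prob` are hypothetical and chosen for illustration.

```python
import random


def break_links(records, keep_prob, seed=None):
    """Illustrative link breaking (an assumption, not the paper's algorithm).

    Each attribute value stays attached to its record with probability
    keep_prob; otherwise it is replaced by a value sampled uniformly from
    the original values of the same attribute.
    """
    rng = random.Random(seed)
    rows = [list(r) for r in records]
    if not rows:
        return rows
    # Snapshot of the original columns; replacements are drawn from these.
    columns = list(zip(*rows))
    for row in rows:
        for col, pool in enumerate(columns):
            if rng.random() >= keep_prob:
                # Break the link: this value no longer belongs to this record.
                row[col] = rng.choice(pool)
    return rows


# Hypothetical toy table: (age range, sex, diagnosis).
records = [
    ("30-39", "F", "flu"),
    ("40-49", "M", "asthma"),
    ("30-39", "M", "flu"),
    ("50-59", "F", "diabetes"),
]
anonymized = break_links(records, keep_prob=0.6, seed=0)
```

Under this assumed scheme, replacements come from each attribute's own empirical distribution, so per-attribute frequencies are unchanged in expectation, while a lower `keep_prob` breaks more links between a record and its values. How the paper itself trades privacy level against accuracy through its parameters is not stated in the abstract.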