Fast clustering-based anonymization approaches with time constraints for data streams

  • Authors:
  • Kun Guo; Qishan Zhang

  • Affiliations:
  • College of Mathematics and Computer Science, Fuzhou University, Fuzhou, Fujian 350108, PR China; Management School, Fuzhou University, Fuzhou, Fujian 350108, PR China

  • Venue:
  • Knowledge-Based Systems
  • Year:
  • 2013

Abstract

Research on the anonymization of static data has made great progress in recent years, with generalization and suppression as the two common techniques for anonymizing quasi-identifiers. However, the characteristics of data streams, such as their potentially unbounded length and highly dynamic nature, make stream anonymization different from static data anonymization, so methods designed for static data cannot be applied directly. In this paper, a novel clustering-based k-anonymization approach for data streams is proposed. To speed up anonymization and reduce information loss, the approach scans the stream in a single pass, recognizing and reusing clusters that satisfy the k-anonymity principle. Time constraints on tuple publication and cluster reuse, which are specific to data streams, are also taken into account. Furthermore, the approach is extended to conform to the ℓ-diversity principle. Experiments on real datasets show that the proposed methods are both efficient and effective.
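
The general idea of single-pass, clustering-based k-anonymization with a publication deadline and cluster reuse can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give. Every name here (`Cluster`, `StreamAnonymizer`, `delay`, `reuse_window`) is hypothetical, and the sketch assumes numeric quasi-identifiers, Manhattan distance for picking neighbours, time measured in number of arrived tuples, and suppression as the fallback when fewer than k tuples are waiting. It performs no ℓ-diversity check.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """A published generalization: one [low, high] range per quasi-identifier."""
    lows: tuple
    highs: tuple
    created_at: int  # stream position at which the cluster was published

    def covers(self, qi):
        return all(lo <= v <= hi for lo, v, hi in zip(self.lows, qi, self.highs))

    def ranges(self):
        return tuple(zip(self.lows, self.highs))


class StreamAnonymizer:
    """Illustrative only: assumed parameters, not taken from the paper."""

    def __init__(self, k, delay, reuse_window):
        self.k = k                        # minimum cluster size
        self.delay = delay                # max arrivals a tuple may wait
        self.reuse_window = reuse_window  # max age of a reusable cluster
        self.buffer = []                  # pending (position, qi) pairs, oldest first
        self.reusable = []                # published clusters still eligible for reuse
        self.clock = 0                    # number of tuples seen so far

    def push(self, qi):
        """Consume one tuple; return the (position, ranges-or-None) pairs published now."""
        qi = tuple(qi)
        pos = self.clock
        self.clock += 1
        # Time constraint on reuse: forget clusters that are too old.
        self.reusable = [c for c in self.reusable
                         if self.clock - c.created_at <= self.reuse_window]
        out = []
        hit = next((c for c in self.reusable if c.covers(qi)), None)
        if hit is not None:
            # Reuse: the cluster already holds k tuples, so publish immediately.
            out.append((pos, hit.ranges()))
        else:
            self.buffer.append((pos, qi))
        # Time constraint on publication: flush tuples that have waited long enough.
        while self.buffer and self.clock - self.buffer[0][0] >= self.delay:
            out.extend(self._flush_oldest())
        return out

    def _flush_oldest(self):
        pos, qi = self.buffer[0]
        if len(self.buffer) < self.k:
            # Not enough companions to reach k: suppress the tuple (None).
            self.buffer.pop(0)
            return [(pos, None)]
        # Group the expiring tuple with its k-1 nearest pending neighbours.
        rest = sorted(self.buffer[1:],
                      key=lambda t: sum(abs(a - b) for a, b in zip(t[1], qi)))
        members = [self.buffer[0]] + rest[: self.k - 1]
        columns = list(zip(*(m[1] for m in members)))
        cluster = Cluster(tuple(min(c) for c in columns),
                          tuple(max(c) for c in columns),
                          self.clock)
        self.reusable.append(cluster)
        chosen = {m[0] for m in members}
        self.buffer = [t for t in self.buffer if t[0] not in chosen]
        return [(p, cluster.ranges()) for p, _ in members]
```

In this sketch, a tuple that falls inside a recently published cluster is released at once with that cluster's ranges, which is where the speed-up from reuse would come from. All other tuples wait at most `delay` arrivals before they are clustered or suppressed.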