On the Hardness of Graph Anonymization

  • Authors:
  • Charu C. Aggarwal; Yao Li; Philip S. Yu

  • Venue:
  • ICDM '11 Proceedings of the 2011 IEEE 11th International Conference on Data Mining
  • Year:
  • 2011

Abstract

This paper examines the problem of node re-identification from anonymized graphs. Typical graphs encountered in real applications are massive and sparse. We show that massive and sparse graphs have certain theoretical properties which make them susceptible to re-identification attacks. We design a systematic way to exploit these properties in order to construct re-identification signatures, also known as characteristic vectors. These signatures are extremely robust to perturbations, especially for massive and sparse graphs. Our results show that even low levels of anonymization require perturbation levels significant enough to cause a massive loss of utility. Our experimental results also show that the true anonymization level of graphs is much lower than is implied by measures such as k-anonymity. Thus, the results of this paper establish that the problem of massive graph anonymization has fundamental theoretical barriers which prevent a fully effective solution.