The union-split algorithm and cluster-based anonymization of social networks

  • Authors: Brian Thompson; Danfeng Yao
  • Affiliations: Rutgers University, Piscataway, NJ; Rutgers University, Piscataway, NJ
  • Venue: Proceedings of the 4th International Symposium on Information, Computer, and Communications Security
  • Year: 2009

Abstract

Knowledge discovery on social network data can uncover latent social trends and produce valuable findings that benefit the general public. A growing body of research finds that social networks play a surprisingly powerful role in people's behavior. Before social network data can be released for research purposes, it must be anonymized to prevent re-identification attacks. Most existing anonymization approaches were developed for relational data and cannot be applied to social network data directly. In this paper, we model social networks as undirected graphs and formally define privacy models and attack models for the anonymization problem. In particular, we define an i-hop degree-based anonymization problem, in which the adversary's prior knowledge includes the target's degree and the degrees of neighbors within i hops of the target. We present two new and efficient clustering methods for undirected graphs, bounded t-means clustering and union-split clustering, which group similar graph nodes into clusters subject to a minimum size constraint. These clustering algorithms are contributions beyond the specific social network problems studied and can be used to cluster general data types besides graph vertices. We also develop a simple yet effective inter-cluster matching method that anonymizes social networks by strategically adding and removing edges based on nodes' social roles. We carry out a series of experiments to evaluate the graph utility of the anonymized social networks produced by our algorithms.
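
To make the i-hop degree-based adversary concrete, the following is a minimal sketch in Python. It is not the paper's algorithm. It assumes the adversary's knowledge can be summarized as the target's degree plus the sorted multiset of degrees of all nodes within i hops; the paper may define this differently (for example, per hop level). The names `i_hop_degree_signature` and `smallest_candidate_set` are hypothetical, introduced only for illustration.

```python
from collections import Counter, deque


def i_hop_degree_signature(adj, target, i):
    """Summarize what an i-hop degree-based adversary might know about `target`.

    `adj` maps each node to the set of its neighbors (undirected graph).
    ASSUMPTION: the signature is (degree of target, sorted degrees of all
    nodes at distance 1..i). This encoding is illustrative, not the paper's.
    """
    dist = {target: 0}
    queue = deque([target])
    while queue:
        u = queue.popleft()
        if dist[u] == i:
            continue  # do not expand beyond i hops
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    nearby = tuple(sorted(len(adj[v]) for v in dist if v != target))
    return (len(adj[target]), nearby)


def smallest_candidate_set(adj, i):
    """Size of the smallest group of nodes sharing one i-hop signature.

    A value of 1 means some node is uniquely identifiable under this
    (assumed) adversary model.
    """
    counts = Counter(i_hop_degree_signature(adj, v, i) for v in adj)
    return min(counts.values())


# Hypothetical example: a path graph a - b - c - d - e.
path = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
for hops in (0, 1):
    print(hops, smallest_candidate_set(path, hops))
```

In this toy graph, knowing only degrees (i = 0) leaves every node with at least one look-alike, while adding the neighbors' degrees (i = 1) singles out the middle node. A minimum cluster size constraint of the kind described in the abstract is intended to rule out such singleton groups.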