Hierarchical, Parameter-Free Community Discovery

  • Authors:
  • Spiros Papadimitriou;Jimeng Sun;Christos Faloutsos;Philip S. Yu

  • Affiliations:
  • IBM T.J. Watson Research Center, Hawthorne, USA;IBM T.J. Watson Research Center, Hawthorne, USA;Carnegie Mellon University, Pittsburgh, USA;University of Illinois, Chicago, USA

  • Venue:
  • ECML PKDD '08: Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases - Part II
  • Year:
  • 2008

Abstract

Given a large bipartite graph (such as a document-term or user-product graph), how can we find meaningful communities quickly and automatically? We propose to look for community hierarchies, with communities-within-communities. Our proposed method, the Context-specific Cluster Tree (CCT), finds such communities at multiple levels, with no user intervention, based on information-theoretic principles (MDL). More specifically, it partitions the graph into progressively more refined subgraphs, allowing users to quickly navigate from the global, coarse structure of a graph to more focused and local patterns. As a fringe benefit, and as an additional indication of its quality, it also achieves better compression than typical, non-hierarchical methods. We demonstrate its scalability and effectiveness on large, real graphs.
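
To make the MDL intuition concrete, the following is a minimal sketch in Python and not the CCT cost function from the paper. It assumes a bipartite graph stored as a 0/1 adjacency matrix and estimates how many bits are needed to encode the entries when rows and columns are grouped into blocks. The function names (entropy_bits, block_cost_bits) and the toy matrix are hypothetical. The sketch counts only the per-block entry cost; a full MDL criterion would also charge for describing the partition itself, which is what keeps it from always preferring finer splits.

```python
import math


def entropy_bits(p):
    """Binary Shannon entropy in bits; 0 when p is 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def block_cost_bits(matrix, row_groups, col_groups):
    """Approximate bits needed to encode the matrix entries block by block.

    Assumption (illustrative only): each block is coded at its empirical
    density, and the cost of describing the partition is ignored.
    """
    k = max(row_groups) + 1
    l = max(col_groups) + 1
    ones = [[0] * l for _ in range(k)]   # number of edges per block
    size = [[0] * l for _ in range(k)]   # number of cells per block
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            r, c = row_groups[i], col_groups[j]
            ones[r][c] += value
            size[r][c] += 1
    return sum(
        size[r][c] * entropy_bits(ones[r][c] / size[r][c])
        for r in range(k)
        for c in range(l)
        if size[r][c] > 0
    )


# Toy bipartite graph with two perfectly separated communities.
A = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

flat = block_cost_bits(A, [0, 0, 0, 0], [0, 0, 0, 0])    # one block
split = block_cost_bits(A, [0, 0, 1, 1], [0, 0, 1, 1])   # two communities

print(flat, split)  # 16.0 0.0
```

On this toy input the single-block encoding needs 16 bits (density 0.5 in 16 cells), while the two-community partition yields homogeneous blocks that cost 0 bits. A hierarchical method could, under the same assumption, apply such a comparison recursively inside each discovered block.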