Information theoretic criteria for community detection

Authors: L. Karl Branting
Affiliation: The MITRE Corporation, McLean, VA
Venue: SNAKDD'08, Proceedings of the Second International Conference on Advances in Social Network Mining and Analysis
Year: 2008

Abstract

Many algorithms for finding community structure in graphs search for a partition that maximizes modularity. However, recent work has identified two important limitations of modularity as a community quality criterion: a resolution limit and a bias towards finding equal-sized communities. Information-theoretic approaches, which search for partitions that minimize description length, are a recent alternative to modularity. This paper shows that two information-theoretic algorithms are themselves subject to a resolution limit and identifies the component of each approach that is responsible for it. It then proposes a variant, SGE (Sparse Graph Encoding), that addresses this limitation, and demonstrates on three artificial data sets that (1) SGE does not exhibit a resolution limit on sparse graphs in which other approaches do, and (2) modularity and the compression-based algorithms, including SGE, behave similarly on graphs not subject to the resolution limit.
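
The resolution limit of modularity mentioned in the abstract can be illustrated with a small sketch. The example below is not taken from the paper: it uses the standard Newman-Girvan modularity formula and a ring-of-cliques graph, a construction commonly used to show the effect, and all function names (`ring_of_cliques`, `modularity`, `compare`) are hypothetical helpers introduced here. It compares the modularity of the intuitive partition (one community per clique) with a coarser one that merges adjacent cliques in pairs.

```python
from collections import defaultdict


def ring_of_cliques(n_cliques, clique_size):
    """Return the edge list of n_cliques complete graphs joined in a ring.

    Nodes of clique c are numbered c*clique_size .. (c+1)*clique_size - 1.
    Assumes n_cliques >= 3 so that the ring has no duplicate edges.
    """
    edges = []
    for c in range(n_cliques):
        base = c * clique_size
        # All edges inside clique c.
        for i in range(clique_size):
            for j in range(i + 1, clique_size):
                edges.append((base + i, base + j))
        # One edge from the last node of clique c to the first node of the next.
        next_base = ((c + 1) % n_cliques) * clique_size
        edges.append((base + clique_size - 1, next_base))
    return edges


def modularity(edges, community):
    """Newman-Girvan modularity Q = sum_c [ l_c / L - (d_c / (2L))^2 ].

    l_c is the number of edges inside community c, d_c is the total degree
    of its nodes, and L is the number of edges in the graph.
    """
    total = len(edges)
    internal = defaultdict(int)
    degree = defaultdict(int)
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / total - (degree[c] / (2 * total)) ** 2
               for c in degree)


def compare(n_cliques, clique_size):
    """Return (Q with one community per clique, Q with adjacent cliques paired)."""
    edges = ring_of_cliques(n_cliques, clique_size)
    n_nodes = n_cliques * clique_size
    single = {v: v // clique_size for v in range(n_nodes)}
    paired = {v: (v // clique_size) // 2 for v in range(n_nodes)}
    return modularity(edges, single), modularity(edges, paired)


# Few cliques: the one-community-per-clique partition scores higher.
print(compare(10, 5))
# Many cliques: merging adjacent cliques scores higher, even though each
# clique is an obvious community. This is the resolution limit.
print(compare(30, 5))
```

For this construction, merging pairs wins once the number of cliques exceeds m(m-1) + 2 for cliques of size m, so a modularity-maximizing search would prefer the coarser partition on the larger ring. The sketch covers only the modularity side of the comparison; it does not attempt to reproduce the description-length encodings or SGE, whose details are not given in the abstract.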