A separability framework for analyzing community structure

Authors:
Bruno Abrahao;Sucheta Soundarajan;John Hopcroft;Robert Kleinberg
Affiliations:
Cornell University, Ithaca, USA;Cornell University, Ithaca, USA;Cornell University, Ithaca, USA;Cornell University, Ithaca, USA
Venue:
ACM Transactions on Knowledge Discovery from Data (TKDD) - CASIN special issue
Year:
2014

Abstract

Four major factors govern the intricacies of community extraction in networks: (1) the literature offers a multitude of disparate community detection algorithms whose output exhibits high structural variability across the collection, (2) communities identified by algorithms may differ structurally from real communities that arise in practice, (3) there is no consensus on how to discriminate communities from noncommunities, and (4) the application domain includes a wide variety of networks of fundamentally different natures. In this article, we present a class separability framework to tackle these challenges through a comprehensive analysis of community properties. Our approach enables the assessment of the structural dissimilarity among the output of multiple community detection algorithms and between the output of algorithms and communities that arise in practice. In addition, our method provides us with a way to organize the vast collection of community detection algorithms by grouping those that behave similarly. Finally, we identify the most discriminative graph-theoretical properties of community signature and the small subset of properties that account for most of the biases of the different community detection algorithms. We illustrate our approach with an experimental analysis, which reveals nuances of the structure of real and extracted communities. In our experiments, we furnish our framework with the output of 10 different community detection procedures, representative of categories of popular algorithms available in the literature, applied to a diverse collection of large-scale real network datasets whose domains span biology, online shopping, and social systems. We also analyze communities identified by annotations that accompany the data, which reflect exemplar communities in various domains. We characterize these communities using a broad spectrum of community properties to produce the different structural classes. As our experiments show that community structure is not a universal concept, our framework enables an informed choice of the most suitable community detection method for identifying communities of a specific type in a given network and allows for a comparison of existing community detection algorithms while guiding the design of new ones.
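
The abstract does not name the community properties or the separability measure the authors use, so the following is only a minimal sketch of the general idea under stated assumptions. It describes each community by two illustrative properties (conductance and internal density, both chosen here as assumptions), treats the communities from two hypothetical sources as two classes, and scores each property with a simple Fisher-style ratio (also an assumption, not necessarily the paper's criterion). The toy graph and the class labels are made up for illustration.

```python
from statistics import mean, pvariance

def conductance(adj, community):
    """Fraction of the community's edge endpoints that lead outside it."""
    members = set(community)
    cut = sum(1 for u in members for v in adj[u] if v not in members)
    volume = sum(len(adj[u]) for u in members)
    return cut / volume if volume else 0.0

def internal_density(adj, community):
    """Internal edges divided by the number of possible internal edges."""
    members = set(community)
    n = len(members)
    if n < 2:
        return 0.0
    internal = sum(1 for u in members for v in adj[u] if v in members) / 2
    return internal / (n * (n - 1) / 2)

def features(adj, community):
    """Feature vector for one community (illustrative choice of properties)."""
    return (conductance(adj, community), internal_density(adj, community))

def fisher_ratio(class_a, class_b):
    """Per-feature score: squared gap between class means over summed variances."""
    scores = []
    for xs, ys in zip(zip(*class_a), zip(*class_b)):
        gap = (mean(xs) - mean(ys)) ** 2
        spread = pvariance(xs) + pvariance(ys)
        scores.append(gap / spread if spread else float("inf"))
    return scores

# Toy undirected graph (hypothetical): two triangles joined by the edge 2-3.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}

# Two hypothetical classes, e.g. the outputs of two different procedures.
class_a = [features(adj, c) for c in ([0, 1, 2], [3, 4, 5])]
class_b = [features(adj, c) for c in ([2, 3], [1, 2, 3, 4])]

scores = fisher_ratio(class_a, class_b)
for name, score in zip(("conductance", "internal_density"), scores):
    print(f"{name}: {score:.3f}")
```

On this toy input, conductance receives the higher score, meaning it separates the two hypothetical classes better than internal density does. A fuller treatment would use many more properties and a multivariate criterion, but the per-feature ratio keeps the idea of ranking properties by discriminative power visible.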