Finding overlapping communities in social networks: toward a rigorous approach

Authors:
Sanjeev Arora; Rong Ge; Sushant Sachdeva; Grant Schoenebeck
Affiliations:
Princeton University, Princeton, NJ, USA; Princeton University, Princeton, NJ, USA; Princeton University, Princeton, NJ, USA; Princeton University, Princeton, NJ, USA
Venue:
Proceedings of the 13th ACM Conference on Electronic Commerce
Year:
2012

Abstract

A community in a social network is usually understood to be a group of nodes more densely connected with each other than with the rest of the network. This is an important concept in most domains where networks arise: social, technological, biological, etc. For many years, algorithms for finding communities implicitly assumed that communities are nonoverlapping (leading to the use of clustering-based approaches), but there is increasing interest in finding overlapping communities. A barrier to finding communities is that the solution concept is often defined in terms of an NP-complete problem such as Clique or Hierarchical Clustering. This paper seeks to initiate a rigorous approach to the problem of finding overlapping communities, where "rigorous" means that we clearly state the following: (a) the object sought by our algorithm, (b) the assumptions about the underlying network, and (c) the (worst-case) running time. The key contribution of this work is the distillation of prior sociology studies into general assumptions that accord well with both sociology research and the current understanding of social networks, while still allowing computationally efficient solutions. Our assumptions about the network lie between worst-case and average-case. An average-case analysis would require a precise probabilistic model of the network, on which there is currently no consensus. However, some plausible assumptions about network parameters can be gleaned from a large body of work in the sociology community, spanning five decades, that focuses on the study of individual communities and ego-centric networks (in graph-theoretic terms, an ego-centric network is the subgraph induced on a node's neighborhood). Thus our assumptions are somewhat "local" in nature. Nevertheless, they suffice to permit a rigorous analysis of the running time of algorithms that recover global structure. Our algorithms use random sampling similar to that in property testing and algorithms for dense graphs. We note, however, that our networks are not necessarily dense graphs, not even in local neighborhoods. Our algorithms explore a local-global relationship between ego-centric and socio-centric networks that we hope will provide a fruitful framework for future work in both computer science and sociology.
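
The abstract describes an ego-centric network as the subgraph induced on a node's neighborhood. The minimal Python sketch below illustrates that definition only; it is not the authors' algorithm. The adjacency-set representation, the function names `ego_network` and `edge_density`, the `include_ego` flag, and the toy graph are all assumptions made for illustration. In particular, the abstract does not say whether the ego node itself belongs to the subgraph, so the sketch leaves that as an option.

```python
from typing import Dict, Hashable, Set

# Undirected graph as adjacency sets (an assumed representation).
Graph = Dict[Hashable, Set[Hashable]]


def ego_network(graph: Graph, ego: Hashable, include_ego: bool = False) -> Graph:
    """Return the subgraph induced on the neighbors of `ego`.

    Whether the ego itself is part of the subgraph is left as an option,
    since the abstract does not specify it (assumption).
    """
    nodes = set(graph[ego])
    if include_ego:
        nodes.add(ego)
    # Keep only edges whose endpoints both lie in the chosen node set.
    return {u: graph[u] & nodes for u in nodes}


def edge_density(sub: Graph) -> float:
    """Fraction of possible undirected edges that are present in `sub`."""
    n = len(sub)
    if n < 2:
        return 0.0
    edges = sum(len(nbrs) for nbrs in sub.values()) / 2  # each edge counted twice
    return edges / (n * (n - 1) / 2)


# Hypothetical toy data: two triangles, {a, b, c} and {c, d, e}, sharing node "c".
toy: Graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"c", "e"},
    "e": {"c", "d"},
}

ego_c = ego_network(toy, "c")
print(ego_c)
print(edge_density(ego_c))
```

In this toy graph, node "c" belongs to both triangles, and its ego-centric network (with the ego excluded) splits into the two pairs {a, b} and {d, e}. The example also shows that an ego-centric network need not be dense: only 2 of the 6 possible edges among the neighbors of "c" are present.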