A distributed algorithm for community detection in large graphs

Authors:
Harris Papadakis;Costas Panagiotakis;Paraskevi Fragopoulou
Affiliations:
Technological Educational Institute of Crete, Crete, Greece;Technological Educational Institute of Crete, Crete, Greece;Technological Educational Institute of Crete, Crete, Greece
Venue:
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
Year:
2013

Citing 3
Cited 0

A Robust Competitive Clustering Algorithm With Applications in Computer Vision

IEEE Transactions on Pattern Analysis and Machine Intelligence
Cluster ensembles --- a knowledge reuse framework for combining multiple partitions

The Journal of Machine Learning Research
Vivaldi: a decentralized network coordinate system

Proceedings of the 2004 conference on Applications, technologies, architectures, and protocols for computer communications

Quantified Score

Hi-index	0.00

Visualization

Abstract

Networks in various application domains present an internal structure, where nodes form groups of tightly connected components which are more loosely connected to the rest of the network. Several attempts have been made to provide a formal definition to the generally described "community finding" concept, providing different approaches. Some algorithms follow an iterative approach starting by characterizing either the entire network, or each individual node as community, and splitting [1] or merging communities respectively, producing a hierarchical tree of nested communities, called dendrogram. Several researchers aim to find the entire hierarchical community dendrogram [1] while others wish to identify only the optimal community partition. Some researchers aim at discovering distinct (non-overlapping) communities, while others allow for overlaps [2]. The Blondel algorithm described by Blondel et al. in [3], follows a bottom-up approach. Each node in the graph comprises a singleton community. Two communities are merged into one if the resulting community has larger modularity value that both the initial ones. This is a fast and accurate algorithm which, detects all communities in the graph. In suffers however in the sense that it constantly, during its execution, requires the knowledge of a global information of the graph, namely the number of its edges (which change as the algorithm modifies the graph), limiting its distributed nature. The Infomap algorithm [4] transforms the problem of community detection into efficiently compressing the structure of the graph, so that one can recovered almost the entire structure from the compressed form. This is achieved by minimizing a function that expresses the tradeoff between compression factor and loss of information (difference between the original graph and the reconstructed graph).