Is there a best quality metric for graph clusters?

Authors:
Hélio Almeida; Dorgival Guedes; Wagner Meira; Mohammed J. Zaki
Affiliations:
Universidade Federal de Minas Gerais, MG, Brazil; Universidade Federal de Minas Gerais, MG, Brazil; Universidade Federal de Minas Gerais, MG, Brazil; Rensselaer Polytechnic Institute, NY
Venue:
ECML PKDD'11: Proceedings of the 2011 European Conference on Machine Learning and Knowledge Discovery in Databases - Volume Part I
Year:
2011

Abstract

Graph clustering, the process of discovering groups of similar vertices in a graph, is an active area of study with applications in many different scenarios. One of its most important aspects is the evaluation of cluster quality, which matters not only for measuring the effectiveness of clustering algorithms but also for gaining insight into the dynamics of relationships in a given network. Many quality evaluation metrics for graph clustering have been proposed in the literature, but there is no consensus on how they compare to each other or how well they perform on different kinds of graphs. In this work we study five major graph clustering quality metrics in terms of their formal biases and their behavior when applied to clusters found by four implementations of classic graph clustering algorithms on five large, real-world graphs. Our results show that these popular quality metrics have strong biases toward incorrectly awarding good scores to some kinds of clusters, especially in larger networks. They also indicate that currently used clustering algorithms and quality metrics do not behave as expected when cluster structures differ from the more traditional, clique-like ones.
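
As a rough illustration of the kind of bias the abstract refers to, the sketch below scores a toy graph with two widely known cluster quality metrics. The abstract does not name the five metrics studied, so coverage and conductance are assumptions chosen only as familiar examples; the toy graph and the function names are hypothetical as well. Coverage is taken here as the fraction of edges that fall inside clusters, and conductance as the number of cut edges divided by the smaller of the two side volumes.

```python
from itertools import combinations


def coverage(edges, clusters):
    """Fraction of edges whose two endpoints lie in the same cluster."""
    label = {v: i for i, cluster in enumerate(clusters) for v in cluster}
    intra = sum(1 for u, v in edges if label[u] == label[v])
    return intra / len(edges)


def conductance(edges, cluster):
    """Cut edges divided by the smaller side volume (sum of degrees).

    Returning 0.0 when one side has zero volume is a convention picked
    for this sketch; other definitions handle that case differently.
    """
    inside = set(cluster)
    cut = sum(1 for u, v in edges if (u in inside) != (v in inside))
    vol_in = sum((u in inside) + (v in inside) for u, v in edges)
    vol_out = 2 * len(edges) - vol_in
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


# Hypothetical toy graph: two 4-cliques joined by a single bridge edge.
edges = (
    list(combinations(range(4), 2))
    + list(combinations(range(4, 8), 2))
    + [(3, 4)]
)

two_cliques = [set(range(4)), set(range(4, 8))]
one_big_cluster = [set(range(8))]

print("coverage, two cliques:     ", coverage(edges, two_cliques))
print("coverage, one big cluster: ", coverage(edges, one_big_cluster))
print("conductance, first clique: ", conductance(edges, range(4)))
```

In this toy case the trivial clustering that puts every vertex in one group obtains a higher coverage than the intuitive two-clique split, simply because no edge is ever cut. That is one simple way a metric can award a good score to an uninformative clustering.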