Benchmarking graph-based clustering algorithms

Authors:
P. Foggia;G. Percannella;C. Sansone;M. Vento
Affiliations:
Dipartimento di Informatica e Sistemistica, Università di Napoli Federico II, Via Claudio, 21 I-80125 Napoli, Italy;Dipartimento di Ingegneria dell'Informazione ed Ingegneria Elettrica, Università di Salerno, Via P.te Don Melillo, I-84084 Fisciano (SA), Italy;Dipartimento di Informatica e Sistemistica, Università di Napoli Federico II, Via Claudio, 21 I-80125 Napoli, Italy;Dipartimento di Ingegneria dell'Informazione ed Ingegneria Elettrica, Università di Salerno, Via P.te Don Melillo, I-84084 Fisciano (SA), Italy
Venue:
Image and Vision Computing
Year:
2009

Abstract

Among the clustering approaches proposed so far, graph-based algorithms are particularly suited to data that does not come from a Gaussian or spherical distribution. They can detect clusters of any size and shape without requiring the actual number of clusters to be specified; moreover, they can be profitably used in cluster detection problems. Although graph-based methods are gaining popularity in different scientific areas, choosing an appropriate algorithm for a given application remains the most crucial task. In this paper, we present a detailed performance evaluation of five graph-based clustering approaches on a database of synthetically generated graphs. The main finding of this analysis is that algorithms based on the Minimum Spanning Tree perform better than the other approaches. Four of the algorithms selected for comparison were chosen from the open literature. While these algorithms do not require the number of clusters to be set, they do need some parameters to be provided by the user. As the fifth algorithm under comparison, we therefore propose an approach that overcomes this limitation, proving to be an effective solution in real applications where a completely unsupervised method for cluster detection is desirable. This was confirmed by a further comparative analysis carried out on four datasets from the UCI Machine Learning Repository.
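
For readers unfamiliar with Minimum Spanning Tree clustering, the general idea can be sketched as follows. This is a generic edge-cutting scheme written for illustration, not any of the five algorithms evaluated in the paper; the function names and the `max_edge` threshold are assumptions. The threshold also shows the kind of user-supplied parameter the abstract refers to: the number of clusters is never given, but a cut-off still has to be chosen.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph over `points`.

    Returns the MST as a list of (weight, u, v) tuples.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge linking each node to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def mst_clusters(points, max_edge):
    """Cut MST edges longer than `max_edge` (a hypothetical user parameter)
    and return one cluster label per point."""
    n = len(points)
    root = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i

    for weight, u, v in mst_edges(points):
        if weight <= max_edge:          # keep short edges, drop long ones
            root[find(u)] = find(v)

    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]
```

For example, `mst_clusters([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)], max_edge=2.0)` returns `[0, 0, 0, 1, 1]`: two clusters are found without stating their number, but the result depends on the chosen threshold.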