MOSAIC: a proximity graph approach for agglomerative clustering

  • Authors:
  • Jiyeon Choo;Rachsuda Jiamthapthaksin;Chun-sheng Chen;Oner Ulvi Celepcikay;Christian Giusti;Christoph F. Eick

  • Affiliations:
  • Computer Science Department, University of Houston, Houston, TX;Computer Science Department, University of Houston, Houston, TX;Computer Science Department, University of Houston, Houston, TX;Computer Science Department, University of Houston, Houston, TX;Department of Mathematics and Computer Science, University of Udine, Udine, Italy;Computer Science Department, University of Houston, Houston, TX

  • Venue:
  • DaWaK'07 Proceedings of the 9th international conference on Data Warehousing and Knowledge Discovery
  • Year:
  • 2007

Abstract

Representative-based clustering algorithms are popular because of their relatively high speed and their sound theoretical foundation. On the other hand, the clusters they can obtain are limited to convex shapes, and clustering results are highly sensitive to initialization. In this paper, a novel agglomerative clustering algorithm called MOSAIC is proposed, which greedily merges neighboring clusters so as to maximize a given fitness function. MOSAIC uses Gabriel graphs to determine which clusters are neighboring and approximates non-convex shapes as unions of small clusters that have been computed using a representative-based clustering algorithm. The experimental results show that this technique leads to clusters of higher quality than running a representative-based clustering algorithm on its own. Given a suitable fitness function, MOSAIC is able to detect clusters of arbitrary shape. In addition, MOSAIC is capable of dealing with high-dimensional data.
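
The two ingredients named in the abstract, Gabriel-graph neighborhood and greedy fitness-driven merging, can be illustrated with a minimal Python sketch. It is not the authors' implementation. Two representatives are Gabriel neighbors when no other representative lies strictly inside the ball whose diameter is the segment joining them. The function names (gabriel_edges, greedy_merge, penalized_compactness), the example fitness function, the choice to score only the representatives rather than the full data set, and the rule of returning the best clustering seen during the whole merge sequence are all assumptions made for illustration.

```python
import numpy as np


def gabriel_edges(points):
    """Return the set of index pairs (i, j), i < j, that are Gabriel neighbors."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            mid = (pts[i] + pts[j]) / 2.0
            radius_sq = np.sum((pts[i] - pts[j]) ** 2) / 4.0
            # Gabriel condition: no third point strictly inside the ball
            # whose diameter is the segment from pts[i] to pts[j].
            blocked = any(
                np.sum((pts[k] - mid) ** 2) < radius_sq
                for k in range(n)
                if k != i and k != j
            )
            if not blocked:
                edges.add((i, j))
    return edges


def greedy_merge(reps, fitness):
    """Greedily merge Gabriel-neighboring clusters of representatives.

    fitness is a hypothetical callable mapping a list of cluster ids
    (one per representative) to a score; higher is better.
    Returns the best-scoring assignment seen (an assumed stopping rule).
    """
    edges = gabriel_edges(reps)
    assignment = list(range(len(reps)))  # every representative starts alone
    best_score, best_assignment = fitness(assignment), list(assignment)
    while True:
        # Two clusters are neighbors if any of their representatives share an edge.
        candidates = sorted(
            {
                tuple(sorted((assignment[i], assignment[j])))
                for i, j in edges
                if assignment[i] != assignment[j]
            }
        )
        if not candidates:
            break
        scored = []
        for a, b in candidates:
            trial = [a if c == b else c for c in assignment]  # merge b into a
            scored.append((fitness(trial), trial))
        score, assignment = max(scored, key=lambda s: s[0])
        if score > best_score:
            best_score, best_assignment = score, list(assignment)
    return best_assignment


def penalized_compactness(reps, assignment, penalty=5.0):
    """Hypothetical fitness: negative within-cluster spread minus a per-cluster penalty."""
    pts = np.asarray(reps, dtype=float)
    ids = np.asarray(assignment)
    spread = 0.0
    for c in set(assignment):
        members = pts[ids == c]
        spread += np.sum((members - members.mean(axis=0)) ** 2)
    return -spread - penalty * len(set(assignment))
```

In practice the representatives would come from a representative-based algorithm such as k-means run with a deliberately large number of clusters, and the fitness function would be evaluated on the data objects assigned to each representative. The brute-force neighbor test above checks every third point for every pair, so it is cubic in the number of representatives and is only meant for small inputs.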