A Parameterless Method for Efficiently Discovering Clusters of Arbitrary Shape in Large Datasets

  • Authors:
  • Andrew Foss; Osmar R. Zaïane

  • Venue:
  • ICDM '02: Proceedings of the 2002 IEEE International Conference on Data Mining
  • Year:
  • 2002

Abstract

Clustering is the problem of grouping data based on similarity and consists of maximizing the intra-group similarity while minimizing the inter-group similarity. The problem of clustering data sets is also known as unsupervised classification, since no class labels are given. However, all existing clustering algorithms require some parameters to steer the clustering process, such as the famous k for the number of expected clusters, which constitutes a supervision of a sort. We present in this paper a new, efficient, fast and scalable clustering algorithm that clusters over a range of resolutions and finds a potential optimum clustering without requiring any parameter input. Our experiments show that our algorithm outperforms most existing clustering algorithms in quality and speed for large data sets.