An algorithm for discovering clusters of different densities or shapes in noisy data sets

  • Authors:
  • Fereshte Khani;Mohammad Javad Hosseini;Ahmad Ali Abin;Hamid Beigy

  • Affiliations:
  • Sharif University of Technology, Iran;Sharif University of Technology, Iran;Sharif University of Technology, Iran;Sharif University of Technology, Iran

  • Venue:
  • Proceedings of the 28th Annual ACM Symposium on Applied Computing
  • Year:
  • 2013

Abstract

In clustering spatial data, we are given a set of points in R^n and the objective is to find the clusters (representing spatial objects) in that set. Finding clusters of different shapes, sizes, and densities in data that contain noise and possibly outliers is a challenging task. The problem has been studied extensively in the machine learning community and has many applications. We present a novel clustering technique that addresses these issues to a considerable extent. In the proposed algorithm, we let the structure of the data set itself find the clusters: points actively send feedback to, and receive feedback from, each other. The idea is to transform the input data set into a graph by adding edges between points that belong to the same cluster, so that connected components correspond to clusters, whereas points in different clusters are almost disconnected. The algorithm starts by creating a preliminary graph and then improves it iteratively. To build the graph (that is, to add more edges), each point sends feedback to its neighboring points. Both the neighborhoods and the feedback to be sent are determined by examining the feedback already received. This process continues until the graph is stable. The clusters are then formed by post-processing the constructed graph. Our algorithm is intuitive, easy to state and analyze, and requires little parameter tuning. Experimental results show that the proposed algorithm outperforms existing related methods in this area.
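
The graph view described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the feedback rule or the iterative refinement, so the sketch substitutes a single pass with a mutual k-nearest-neighbor test (an edge is added only when two points each list the other among their k nearest). The names `cluster`, `k`, and `min_size` are hypothetical and chosen only for illustration. Connected components of the resulting graph are reported as clusters, and components smaller than `min_size` are labeled as noise (-1), which stands in for the post-processing step.

```python
import math
from collections import defaultdict


def knn(points, k):
    """For each point index, return the set of indices of its k nearest other points."""
    neighbors = []
    for i, p in enumerate(points):
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(set(order[:k]))
    return neighbors


def build_graph(points, k):
    """Assumed stand-in for the feedback step: keep edge i-j only if it is mutual."""
    nbrs = knn(points, k)
    graph = defaultdict(set)
    for i, candidates in enumerate(nbrs):
        for j in candidates:
            if i in nbrs[j]:  # j also lists i among its k nearest
                graph[i].add(j)
                graph[j].add(i)
    return graph


def connected_components(n, graph):
    """Label each of the n vertices with the index of its connected component."""
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in graph[u]:
                if labels[v] == -1:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels


def cluster(points, k=2, min_size=2):
    """Return one label per point; components smaller than min_size become noise (-1)."""
    graph = build_graph(points, k)
    labels = connected_components(len(points), graph)
    sizes = defaultdict(int)
    for label in labels:
        sizes[label] += 1
    return [label if sizes[label] >= min_size else -1 for label in labels]


if __name__ == "__main__":
    # Two tight groups of three points and one distant outlier.
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
    print(cluster(pts, k=2, min_size=2))  # [0, 0, 0, 1, 1, 1, -1]
```

In this toy input the outlier at (50, 50) selects two points of the second group as its nearest neighbors, but neither selects it back, so it receives no edge and ends up as noise. The mutual test is only one simple way to keep points in different clusters "almost disconnected"; the paper's feedback mechanism may behave quite differently.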