Aggregation pheromone density based data clustering

  • Authors: Ashish Ghosh; Anindya Halder; Megha Kothari; Susmita Ghosh

  • Affiliations: Machine Intelligence Unit and Center for Soft Computing Research, Indian Statistical Institute, Kolkata, India; Center for Soft Computing Research, Indian Statistical Institute, Kolkata, India; Department of Computer Science and Engineering, Jadavpur University, Kolkata, India; Department of Computer Science and Engineering, Jadavpur University, Kolkata, India

  • Venue: Information Sciences: an International Journal
  • Year: 2008

Abstract

Ants, bees and other social insects deposit pheromone (a type of chemical) to communicate with other members of their community. A pheromone that causes clumping or clustering behavior in a species and brings individuals into closer proximity is called an aggregation pheromone. This article presents a new algorithm, called APC, for clustering data sets based on this property of aggregation pheromone found in ants. An ant is placed at the location of each data point, and the ants are allowed to move in the search space to find points with higher pheromone density. The movement of an ant is governed by the amount of pheromone deposited at different points of the search space: the more pheromone deposited, the greater the aggregation of ants. This leads to the formation of homogeneous groups of data. The proposed algorithm is evaluated on a number of well-known benchmark data sets using different cluster validity measures. Results are compared with those obtained using two popular standard clustering techniques, namely the average linkage agglomerative and k-means clustering algorithms, and with an ant-based method called adaptive time-dependent transporter ants for clustering (ATTA-C). Experimental results demonstrate the potential of the proposed APC algorithm, in terms of both solution (clustering) quality and execution time, compared to the other algorithms for a large number of data sets.
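
The idea of placing one ant on each data point and letting ants drift toward higher pheromone density can be illustrated with a minimal sketch. The abstract does not give the pheromone model or the movement rule, so everything specific here is an assumption made for illustration: a Gaussian pheromone kernel of width `delta`, a simultaneous update that moves each ant toward the pheromone-weighted mean of all ants (in effect a mean-shift-style update, not necessarily the authors' rule), and a greedy grouping step with a hypothetical tolerance `tol`. The function names `move_ants` and `group_ants` are likewise hypothetical.

```python
import numpy as np


def move_ants(data, delta=1.0, step=1.0, n_iter=50):
    """Move one ant per data point toward regions of higher pheromone density.

    Assumptions (not from the source): pheromone spreads with a Gaussian
    kernel of width `delta`, and each ant moves a fraction `step` of the way
    toward the pheromone-weighted mean position of all ants.
    """
    ants = np.asarray(data, dtype=float).copy()
    for _ in range(n_iter):
        # Pairwise squared distances between current ant positions.
        diff = ants[:, None, :] - ants[None, :, :]
        sq_dist = np.sum(diff ** 2, axis=-1)
        # Pheromone each ant senses from every ant, itself included.
        pheromone = np.exp(-sq_dist / (2.0 * delta ** 2))
        # Pheromone-weighted mean position seen by each ant.
        target = pheromone @ ants / pheromone.sum(axis=1, keepdims=True)
        # All ants move simultaneously.
        ants += step * (target - ants)
    return ants


def group_ants(ants, tol=0.1):
    """Greedily label ants that ended up within `tol` of each other."""
    labels = -np.ones(len(ants), dtype=int)
    next_label = 0
    for i in range(len(ants)):
        if labels[i] >= 0:
            continue  # already assigned to an earlier group
        close = np.linalg.norm(ants - ants[i], axis=1) < tol
        labels[close & (labels < 0)] = next_label
        next_label += 1
    return labels


# Toy data: two well-separated groups of three points each.
data = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, 0.5],
                 [10.0, 10.0], [10.5, 10.0], [10.0, 10.5]])
labels = group_ants(move_ants(data), tol=0.1)
print(labels)
```

In this sketch the label of each ant is also the cluster label of the data point it started on. The kernel width `delta` plays the role of a neighborhood scale: a larger value lets pheromone from distant ants merge groups, while a smaller value tends to produce more, tighter groups.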