Ranked k-medoids: A fast and accurate rank-based partitioning algorithm for clustering large datasets

  • Authors:
  • Seyed Mohammad Razavi Zadegan;Mehdi Mirzaie;Farahnaz Sadoughi

  • Affiliations:
  • Department of Health Information Management, School of Health Management and Information Sciences, Tehran University of Medical Sciences, Tehran, Iran;Proteomics Research Center, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran and Department of Bioinformatics, School of Computer Science, Institute fo ...;Department of Health Information Management, School of Health Management and Information Sciences, Tehran University of Medical Sciences, Tehran, Iran

  • Venue:
  • Knowledge-Based Systems
  • Year:
  • 2013

Abstract

Clustering analysis is the process of dividing a set of objects into non-overlapping subsets. Each subset is a cluster, such that objects within a cluster are similar to one another and dissimilar to objects in other clusters. Most partitioning algorithms for clustering are prone to becoming trapped in local optima and are sensitive to initialization and outliers. In this paper, we introduce a novel partitioning algorithm whose initialization does not lead it to a local optimum and which can find all Gaussian-shaped clusters, provided it is given the correct number of clusters. In this algorithm, the similarity between each pair of objects is computed only once, and updating the medoids in each iteration costs O(k × m), where k is the number of clusters and m is the number of objects needed to update the medoids of the clusters. We compare our algorithm with two other partitioning algorithms using four well-known external validation measures over seven standard datasets. The results for the larger datasets show that the proposed algorithm outperforms the other two in both speed and accuracy.
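
The abstract describes two ideas that a short sketch may make more concrete: pairwise (dis)similarities are computed once up front, and each medoid update looks at only m objects per cluster. The Python below is a minimal illustration under stated assumptions, not the authors' ranked k-medoids algorithm. It uses Euclidean distance as the dissimilarity, takes the m cluster members closest to the current medoid as the candidates, and does not reproduce the paper's ranking scheme or its O(k × m) cost. The names `pairwise_distances` and `k_medoids_local` are hypothetical.

```python
import math


def pairwise_distances(points):
    """Compute the full dissimilarity matrix once (Euclidean distance is an assumption)."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            dist[i][j] = dist[j][i] = d
    return dist


def assign(dist, medoids):
    """Label each object with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
        for i in range(len(dist))
    ]


def k_medoids_local(dist, initial_medoids, m, max_iter=100):
    """k-medoids variant whose update step inspects at most m objects per cluster.

    Illustrative only: the candidate rule (the m members nearest the current
    medoid) is an assumption, not the rule used in the paper.
    """
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c, med in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == c]
            # Restrict the update to the m members closest to the current medoid.
            candidates = sorted(members, key=lambda i: dist[med][i])[:m]
            if not candidates:
                # Empty cluster (e.g. duplicate medoids): keep the old medoid.
                new_medoids.append(med)
                continue
            # New medoid: the candidate with the smallest total distance
            # to the other candidates. No new distances are computed here.
            best = min(candidates, key=lambda i: sum(dist[i][j] for j in candidates))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)


# Two small, well-separated groups, each with an obvious central point.
points = [(0, 0), (2, 0), (0, 2), (2, 2), (1, 1),
          (10, 10), (12, 10), (10, 12), (12, 12), (11, 11)]
dist = pairwise_distances(points)
medoids, labels = k_medoids_local(dist, initial_medoids=[0, 5], m=5)
print(medoids, labels)
```

Because every lookup inside the loop reads from the precomputed `dist` matrix, the per-iteration work depends on the number of clusters and on m rather than on recomputing distances. Unlike the algorithm the abstract describes, this sketch remains sensitive to the choice of initial medoids.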