Ranked k-medoids: A fast and accurate rank-based partitioning algorithm for clustering large datasets

Authors:
Seyed Mohammad Razavi Zadegan;Mehdi Mirzaie;Farahnaz Sadoughi
Affiliations:
Department of Health Information Management, School of Health Management and Information Sciences, Tehran University of Medical Sciences, Tehran, Iran;Proteomics Research Center, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran and Department of Bioinformatics, School of Computer Science, Institute fo ...;Department of Health Information Management, School of Health Management and Information Sciences, Tehran University of Medical Sciences, Tehran, Iran
Venue:
Knowledge-Based Systems
Year:
2013

Abstract

Clustering analysis is the process of dividing a set of objects into non-overlapping subsets. Each subset is a cluster, such that objects within a cluster are similar to one another and dissimilar to objects in other clusters. Most partitioning algorithms for clustering tend to become trapped in local optima and are sensitive to initialization and outliers. In this paper, we introduce a novel partitioning algorithm whose initialization does not lead it to a local optimum and which can find all Gaussian-shaped clusters, provided it is given the correct number of clusters. In this algorithm, the similarity between each pair of objects is computed only once, and updating the medoids in each iteration costs O(k × m), where k is the number of clusters and m is the number of objects needed to update the medoids of the clusters. We compare our algorithm with two other partitioning algorithms using four well-known external validation measures over seven standard datasets. The results for the larger datasets show that the proposed algorithm outperforms the other two in terms of both speed and accuracy.
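
As a rough illustration of two ideas named in the abstract (pairwise dissimilarities computed once, and medoid updates that consult only m objects per cluster), the following Python sketch may help. It is not the paper's procedure. The abstract does not specify the update rule, so the criterion used here (pick the candidate with the smallest total distance to the other candidates) is an assumption, as are all function names, the use of Euclidean distance, and the max_iter parameter. This stand-in rule costs O(m^2) per cluster and is not claimed to reach the O(k × m) bound stated above.

```python
import math


def pairwise_distances(points):
    """Compute the full distance matrix once, up front."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])  # Euclidean distance (assumed)
            dist[i][j] = dist[j][i] = d
    return dist


def neighbor_order(dist):
    """For each object, list all object indices from nearest to farthest."""
    return [sorted(range(len(row)), key=row.__getitem__) for row in dist]


def update_medoids(dist, order, medoids, m):
    """Re-pick each medoid among the m objects nearest to it.

    Hypothetical rule for illustration only: the new medoid is the
    candidate with the smallest total distance to the other candidates.
    """
    new_medoids = []
    for med in medoids:
        candidates = order[med][:m]  # includes the medoid itself
        best = min(candidates,
                   key=lambda c: sum(dist[c][o] for o in candidates))
        new_medoids.append(best)
    return new_medoids


def cluster(points, initial_medoids, m, max_iter=20):
    """Iterate medoid updates, then assign each object to its nearest medoid."""
    dist = pairwise_distances(points)  # computed once, reused every iteration
    order = neighbor_order(dist)       # also computed once
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        updated = update_medoids(dist, order, medoids, m)
        if updated == medoids:         # no medoid moved: stop
            break
        medoids = updated
    labels = [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
              for i in range(len(points))]
    return medoids, labels
```

Calling cluster(points, initial_medoids, m) returns the final medoid indices and one cluster label per object. The point of the sketch is that no distance is recomputed inside the loop: every iteration only reads the precomputed matrix and the precomputed neighbor ordering.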