Computation of initial modes for K-modes clustering algorithm using evidence accumulation

Authors:
Shehroz S. Khan;Shri Kant
Affiliations:
National University of Ireland Galway, Department of Information Technology, Galway, Republic of Ireland;Scientific Analysis Group, Defence R&D Organization, Delhi, India
Venue:
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Year:
2007



Abstract

The clustering accuracy of partitional clustering algorithms for categorical data depends primarily on the choice of initial data points (modes) used to start the clustering process. Traditionally, initial modes are chosen randomly; as a consequence, the clustering results cannot be reproduced consistently. In this paper we present an approach to compute initial modes for the K-modes clustering algorithm for categorical data sets. We use the idea of Evidence Accumulation to combine the results of multiple clusterings. Initially, the n F-dimensional data objects are decomposed into a large number of compact clusters; the K-modes algorithm performs this decomposition, with several clusterings obtained from N random initializations of the K-modes algorithm. The modes obtained from every run are stored in a Mode-Pool, P_N. The objective is to investigate the contribution of those data objects/patterns that are less vulnerable to the random selection of modes, and to choose the most diverse set of modes from the Mode-Pool to serve as initial modes for the K-modes clustering algorithm. Experimentally, we found that this method yields initial modes that are very similar to the actual/desired modes and gives consistent, better clustering results with lower variance in clustering error than the traditional method of choosing random modes.
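
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`hamming`, `k_modes`, `build_mode_pool`, `pick_diverse_modes`) are hypothetical, the Hamming (simple matching) dissimilarity is an assumed choice, and the final selection step uses a greedy farthest-point heuristic as a stand-in, since the abstract does not state how the most diverse modes are actually chosen.

```python
import random
from collections import Counter


def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(data, k, rng, max_iter=20):
    """Plain K-modes started from k randomly sampled records."""
    modes = rng.sample(data, k)
    for _ in range(max_iter):
        # Assign every record to its nearest mode.
        clusters = [[] for _ in range(k)]
        for row in data:
            nearest = min(range(k), key=lambda i: hamming(row, modes[i]))
            clusters[nearest].append(row)
        # Recompute each mode as the most frequent value per attribute.
        new_modes = []
        for j, members in enumerate(clusters):
            if not members:
                new_modes.append(modes[j])  # empty cluster: keep the old mode
                continue
            new_modes.append(
                tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
            )
        if new_modes == modes:
            break
        modes = new_modes
    return modes


def build_mode_pool(data, k, n_runs, seed=0):
    """Collect the modes of n_runs randomly initialized K-modes runs."""
    rng = random.Random(seed)
    pool = []
    for _ in range(n_runs):
        pool.extend(k_modes(data, k, rng))
    return pool


def pick_diverse_modes(pool, k):
    """Greedy farthest-point selection (assumed heuristic, not the paper's rule).

    Starts from the mode that occurs most often in the pool, then repeatedly
    adds the candidate whose distance to its closest chosen mode is largest.
    """
    chosen = [Counter(pool).most_common(1)[0][0]]
    candidates = sorted(set(pool))
    while len(chosen) < k and len(chosen) < len(candidates):
        best = max(
            (m for m in candidates if m not in chosen),
            key=lambda m: min(hamming(m, c) for c in chosen),
        )
        chosen.append(best)
    return chosen
```

Under these assumptions, `pick_diverse_modes(build_mode_pool(data, k, n_runs), k)` would return candidate initial modes for a final K-modes run, where `data` is a list of equal-length tuples of categorical values. Seeding the selection with the most frequent pooled mode is one loose reading of favoring patterns that are less sensitive to random initialization.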