Parallel k/h-Means Clustering for Large Data Sets
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
A fuzzy k-modes algorithm for clustering categorical data
IEEE Transactions on Fuzzy Systems
Clustering is a fundamental technique in fields such as image processing, pattern recognition, and data compression. However, most recent clustering algorithms cannot handle large, complex databases and do not always achieve high clustering accuracy. This paper proposes a parallel clustering algorithm for categorical and mixed data that overcomes both problems. Our contributions are: (1) improving the k-sets algorithm [3] to produce more accurate clustering results; and (2) applying parallel techniques to the improved approach to obtain a parallel algorithm. Experiments on a CRAY T3E show that the proposed algorithm achieves higher accuracy than previous attempts and reduces processing time; thus, it is practical for use with very large and complex databases.
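The abstract does not spell out the improved k-sets algorithm or its parallelization, so the following is only a rough illustration of the general idea of data-parallel clustering for categorical records. It swaps in the well-known k-modes procedure (Hamming distance, per-attribute modes) rather than the paper's method, and it assumes the data is already split into chunks, one per processor. All function names here (`local_step`, `merge_and_update`, `parallel_style_k_modes`) are hypothetical, and the chunks are processed in a plain loop where a real system would run them concurrently.

```python
from collections import Counter

def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))

def local_step(chunk, modes):
    """Work one (hypothetical) processor does on its own chunk of records.

    Returns per-cluster, per-attribute category counts for that chunk.
    """
    n_attrs = len(modes[0])
    counts = [[Counter() for _ in range(n_attrs)] for _ in modes]
    for record in chunk:
        # Assign the record to the nearest mode (ties go to the lowest index).
        nearest = min(range(len(modes)), key=lambda j: hamming(record, modes[j]))
        for attr, value in enumerate(record):
            counts[nearest][attr][value] += 1
    return counts

def merge_and_update(all_counts, modes):
    """Reduction step: sum the partial counts and pick each cluster's new mode."""
    new_modes = []
    for j, old_mode in enumerate(modes):
        mode = []
        for attr, old_value in enumerate(old_mode):
            total = Counter()
            for counts in all_counts:
                total += counts[j][attr]
            # Keep the old value if the cluster received no records.
            mode.append(total.most_common(1)[0][0] if total else old_value)
        new_modes.append(tuple(mode))
    return new_modes

def parallel_style_k_modes(chunks, modes, max_iter=20):
    """Iterate local steps and reductions until the modes stop changing."""
    for _ in range(max_iter):
        # Assumption: in a real parallel run each chunk is handled concurrently.
        partial = [local_step(chunk, modes) for chunk in chunks]
        new_modes = merge_and_update(partial, modes)
        if new_modes == modes:
            break
        modes = new_modes
    return modes

# Toy data: two chunks standing in for two processors.
chunks = [
    [("a", "x", "p"), ("a", "x", "p"), ("b", "y", "q")],
    [("b", "y", "q"), ("a", "x", "p"), ("b", "y", "q")],
]
initial_modes = [("a", "x", "q"), ("b", "y", "p")]
final_modes = parallel_style_k_modes(chunks, initial_modes)
```

In this sketch only the small count tables cross chunk boundaries, never the records themselves, which is the usual reason a partition-and-reduce layout suits very large data sets.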