A novel typical-sample-weighted clustering algorithm for large data sets

Authors:
Jie Li;Xinbo Gao;Licheng Jiao
Affiliations:
School of Electronic Engineering, Xidian University, Xi’an, China;School of Electronic Engineering, Xidian University, Xi’an, China;School of Electronic Engineering, Xidian University, Xi’an, China
Venue:
CIS'05 Proceedings of the 2005 international conference on Computational Intelligence and Security - Volume Part I
Year:
2005



Abstract

In cluster analysis, most existing algorithms were developed for small data sets and cannot effectively process the large data sets encountered in data mining. Moreover, most clustering algorithms treat every sample as contributing equally to the classification, although different samples should contribute differently to the clustering result. To address both problems, a novel typical-sample-weighted clustering algorithm is proposed for large data sets. Using atom clustering, the new algorithm extracts typical samples to reduce the amount of data. The extracted samples are then weighted by their typicality and clustered with the classical fuzzy c-means (FCM) algorithm. Finally, the Mahalanobis distance is used to assign each original sample to one of the obtained clusters. By running FCM on a reduced, typicality-weighted sample set, the novel algorithm can improve the speed and robustness of the traditional FCM algorithm. Experimental results on various test data sets illustrate the effectiveness of the proposed clustering algorithm.
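The weighted clustering step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the update equations, so the code assumes one common sample-weighted FCM formulation in which each sample's weight multiplies its term in the objective. Under that assumption the membership update is the usual FCM rule and only the center update changes. The function name `weighted_fcm`, the fuzzifier `m=2.0`, the fixed iteration count, and the toy data are all hypothetical. The atom-clustering extraction and the final Mahalanobis assignment are not shown; the typical samples and their typicality weights are taken as given inputs.

```python
import math


def weighted_fcm(samples, weights, centers, m=2.0, iters=50):
    """Fuzzy c-means where sample k carries a weight w_k (assumed given)."""
    dim = len(samples[0])
    centers = [tuple(c) for c in centers]
    memberships = []
    for _ in range(iters):
        # Membership step: the usual FCM rule (weights cancel out here).
        memberships = []
        for x in samples:
            # Clamp distances to avoid division by zero at a center.
            d = [max(math.dist(x, c), 1e-12) for c in centers]
            memberships.append(
                [1.0 / sum((di / dj) ** (2.0 / (m - 1.0)) for dj in d)
                 for di in d]
            )
        # Center step: sample k contributes w_k * u_ik^m instead of u_ik^m.
        new_centers = []
        for i in range(len(centers)):
            coef = [w * row[i] ** m for w, row in zip(weights, memberships)]
            total = sum(coef)
            new_centers.append(
                tuple(sum(cf * x[j] for cf, x in zip(coef, samples)) / total
                      for j in range(dim))
            )
        centers = new_centers
    # Memberships are those computed in the final iteration.
    return centers, memberships


# Hypothetical typical samples (two well-separated groups) and weights.
typical = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
typicality = [3.0, 1.0, 1.0, 3.0, 1.0, 1.0]
centers, memberships = weighted_fcm(
    typical, typicality, [(1.0, 1.0), (9.0, 9.0)]
)
```

In this toy run the sample with weight 3 in each group pulls its cluster center toward itself, away from the unweighted group mean, which is the effect the typicality weighting is meant to have.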