Improvement of the k-means clustering filtering algorithm

  • Authors:
  • Jim Z. C. Lai; Yi-Ching Liaw

  • Affiliations:
  • Department of Computer Science, National Taiwan Ocean University, Keelung 202, Taiwan, ROC; Department of Computer Science and Engineering, Nanhua University, Chiayi 622, Taiwan, ROC

  • Venue:
  • Pattern Recognition
  • Year:
  • 2008

Abstract

In this paper, we present a modified filtering algorithm (MFA) that uses center variations to speed up the clustering process. Our method first divides clusters into static and active groups. We use the information on cluster displacements to reject unlikely cluster centers for all nodes in the kd-tree. We reduce the computational complexity of the filtering algorithm (FA) by finding the candidates for each node mainly from the set of active cluster centers. Two conditions are developed for determining the set of candidate cluster centers for each node from the active clusters. Our approach differs from the major available algorithm, which passes no information from one iteration to the next. Theoretical analysis shows that our method can reduce the computational complexity of FA, measured as the number of distance calculations, at each iteration by a factor of FC/AC, where FC and AC are the total number of clusters and the number of active clusters, respectively. Compared with FA, our algorithm effectively reduces both the computing time and the number of distance calculations. The proposed algorithm generates the same clusters as those produced by hard k-means clustering. The advantage of our method becomes more pronounced for larger data sets of higher dimension.
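The static/active split and the FC/AC ratio can be illustrated with a minimal sketch. It is not the authors' MFA: it runs plain hard k-means without a kd-tree, it does not implement the two candidate-selection conditions (the abstract does not state them), and it assumes a cluster counts as "active" when its center moved by more than a tolerance `tol` in the last iteration. All function names and the `tol` parameter are hypothetical.

```python
import math
import random


def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def lloyd_step(points, centers):
    """One hard k-means iteration: assign every point, then recompute means."""
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for p in points:
        j = nearest(p, centers)
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    new_centers = []
    for j, c in enumerate(centers):
        if counts[j] == 0:
            new_centers.append(list(c))  # an empty cluster keeps its center
        else:
            new_centers.append([s / counts[j] for s in sums[j]])
    return new_centers


def split_static_active(old_centers, new_centers, tol=0.0):
    """Assumed rule: a cluster is active if its center moved by more than tol."""
    static, active = [], []
    for j, (old, new) in enumerate(zip(old_centers, new_centers)):
        (active if math.dist(old, new) > tol else static).append(j)
    return static, active


def run(points, centers, max_iter=50):
    """Iterate until no center moves; record (FC, AC) for every iteration."""
    history = []
    for _ in range(max_iter):
        new_centers = lloyd_step(points, centers)
        _, active = split_static_active(centers, new_centers)
        history.append((len(centers), len(active)))
        centers = new_centers
        if not active:
            break
    return centers, history


if __name__ == "__main__":
    rng = random.Random(0)
    blobs = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0), (5.0, 0.0)]
    points = [[cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)]
              for cx, cy in blobs for _ in range(50)]
    start = [list(p) for p in rng.sample(points, 4)]
    _, history = run(points, start)
    for stage, (fc, ac) in enumerate(history, 1):
        print(f"iteration {stage}: FC={fc} AC={ac}")
```

In this sketch every point is still compared against all FC centers, so it only reports how AC shrinks as centers settle. The saving described in the abstract would come from restricting the candidate search at each kd-tree node mainly to the AC active centers.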