An agglomerative clustering algorithm using a dynamic k-nearest-neighbor list

  • Authors:
  • Jim Z. C. Lai; Tsung-Jen Huang

  • Affiliations:
  • Department of Computer Science and Engineering, National Taiwan Ocean University, Keelung 202, Taiwan, ROC; Department of Computer Science and Engineering, National Taiwan Ocean University, Keelung 202, Taiwan, ROC and Department of Information and Communications Research Laboratories, Industrial Techno ...

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2011

Abstract

This paper develops a new algorithm that reduces the computational complexity of Ward's method. The proposed approach uses a dynamic k-nearest-neighbor list to avoid determining a cluster's nearest neighbor at some steps of the cluster merge. The double linked algorithm (DLA) can significantly reduce the computing time of the fast pairwise nearest neighbor (FPNN) algorithm, but it does so by obtaining only an approximate solution of hierarchical agglomerative clustering. The proposed method resolves the non-optimal solution problem of DLA while keeping its advantage of low computational complexity. Measured in the number of distance calculations, the computational complexity of the proposed method, DKNNA+FS (dynamic k-nearest-neighbor algorithm with a fast search), is O(N^2), where N is the number of data points. Compared with FPNN with a fast search (FPNN+FS), DKNNA+FS, which uses the same fast search algorithm, reduces the computing time by a factor of 1.90-2.18 on a data set from a real image and by a factor of 1.92-2.02 on a data set generated from three images. Compared with DLA with a fast search (DLA+FS), DKNNA+FS decreases the average mean square error by 1.26% for the same data set.
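
To make the setting concrete, the following is a minimal Python sketch of the kind of baseline the abstract builds on: exact agglomerative clustering under Ward's merge cost, with one cached nearest-neighbor pointer per cluster. It is not the authors' DKNNA and includes neither the dynamic k-nearest-neighbor list nor the fast search, since the abstract does not specify them. The names `ward_cost`, `nearest`, and `pnn_cluster` are hypothetical, and the merge cost is assumed to be the standard Ward form n_a n_b / (n_a + n_b) * ||c_a - c_b||^2 for clusters of sizes n_a, n_b with centroids c_a, c_b.

```python
from typing import Dict, List, Sequence, Set, Tuple


def ward_cost(n_a: int, c_a: Sequence[float], n_b: int, c_b: Sequence[float]) -> float:
    """Increase in squared error caused by merging two clusters (Ward's criterion)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(c_a, c_b))
    return n_a * n_b / (n_a + n_b) * sq_dist


def nearest(i: int, sizes: List[int], cents: List[List[float]],
            active: Set[int]) -> Tuple[int, float]:
    """Full search for the cheapest merge partner of cluster i."""
    best_j, best_d = -1, float("inf")
    for j in active:
        if j != i:
            d = ward_cost(sizes[i], cents[i], sizes[j], cents[j])
            if d < best_d:
                best_j, best_d = j, d
    return best_j, best_d


def pnn_cluster(points: Sequence[Sequence[float]], m: int) -> List[List[int]]:
    """Merge clusters pairwise until m remain; returns point indices per cluster."""
    if not 1 <= m <= len(points):
        raise ValueError("m must be between 1 and the number of points")
    sizes = [1] * len(points)
    cents = [list(p) for p in points]
    members = [[i] for i in range(len(points))]
    active = set(range(len(points)))
    # Cached nearest neighbor (index, cost) for every active cluster.
    nn: Dict[int, Tuple[int, float]] = {
        i: nearest(i, sizes, cents, active) for i in active
    }
    while len(active) > m:
        a = min(active, key=lambda i: nn[i][1])  # cheapest pending merge
        b = nn[a][0]
        n = sizes[a] + sizes[b]
        cents[a] = [(sizes[a] * x + sizes[b] * y) / n
                    for x, y in zip(cents[a], cents[b])]
        sizes[a] = n
        members[a] += members[b]
        active.remove(b)
        del nn[b]
        # Only the merged cluster, and clusters whose cached neighbor was a or b,
        # need a new search; Ward's cost cannot make the merged cluster cheaper
        # for the others than min(cost to a, cost to b).
        for i in active:
            if i == a or nn[i][0] in (a, b):
                nn[i] = nearest(i, sizes, cents, active)
    return [sorted(members[i]) for i in sorted(active)]
```

For example, `pnn_cluster([[0.0], [0.1], [5.0], [5.2], [10.0]], 3)` groups the two nearby pairs and leaves the last point alone. Each call to `nearest` in the update loop is a full search over the remaining clusters; according to the abstract, the proposed method uses its dynamic k-nearest-neighbor list to avoid this nearest-neighbor determination at some merge steps.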