An agglomerative clustering algorithm using a dynamic k-nearest-neighbor list

Authors:
Jim Z. C. Lai; Tsung-Jen Huang
Affiliations:
Department of Computer Science and Engineering, National Taiwan Ocean University, Keelung 202, Taiwan, ROC;Department of Computer Science and Engineering, National Taiwan Ocean University, Keelung 202, Taiwan, ROC and Department of Information and Communications Research Laboratories, Industrial Techno ...
Venue:
Information Sciences: an International Journal
Year:
2011

Abstract

This paper develops a new algorithm that reduces the computational complexity of Ward's method. The proposed approach maintains a dynamic k-nearest-neighbor list so that a cluster's nearest neighbor need not be determined at some steps of the cluster merge. The double linked algorithm (DLA) can significantly reduce the computing time of the fast pairwise nearest neighbor (FPNN) algorithm, but it does so by obtaining only an approximate solution of hierarchical agglomerative clustering. The proposed method resolves DLA's problem of a non-optimal solution while keeping its advantage of low computational complexity. In terms of the number of distance calculations, the computational complexity of the proposed method, DKNNA+FS (dynamic k-nearest-neighbor algorithm with a fast search), is O(N^2), where N is the number of data points. Compared to FPNN with a fast search (FPNN+FS), DKNNA+FS, which uses the same fast search algorithm, reduces the computing time by a factor of 1.90-2.18 for the data set from a real image and by a factor of 1.92-2.02 for the data set generated from three images. Compared to DLA with a fast search (DLA+FS), DKNNA+FS decreases the average mean square error by 1.26% for the same data set.
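
For orientation, the baseline that the abstract sets out to accelerate can be sketched as follows. This is a minimal, unoptimized illustration of agglomerative clustering with Ward's merge cost, not the DKNNA+FS method itself: it searches every cluster pair at every step, which is exactly the cost that nearest-neighbor lists are meant to avoid. The function names `ward_cost` and `ward_agglomerate` are hypothetical and chosen only for this example.

```python
from itertools import combinations


def ward_cost(na, ca, nb, cb):
    """Increase in total squared error if clusters a and b are merged.

    na, nb are cluster sizes; ca, cb are centroids (tuples of floats).
    """
    squared_distance = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * squared_distance


def ward_agglomerate(points, m):
    """Merge singleton clusters until m remain.

    Returns a list of (size, centroid) pairs. Assumption: exhaustive pair
    search at every step, i.e. no nearest-neighbor list and no fast search.
    """
    clusters = [(1, tuple(float(x) for x in p)) for p in points]
    while len(clusters) > m:
        # Find the pair (i, j), i < j, with the smallest merge cost.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(*clusters[ij[0]], *clusters[ij[1]]),
        )
        (na, ca), (nb, cb) = clusters[i], clusters[j]
        n = na + nb
        # The merged centroid is the size-weighted mean of the two centroids.
        centroid = tuple((na * x + nb * y) / n for x, y in zip(ca, cb))
        del clusters[j]  # j > i, so index i is unaffected by this deletion
        clusters[i] = (n, centroid)
    return clusters


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(ward_agglomerate(data, 2))
```

Because each of the roughly N merge steps scans all remaining pairs, this sketch performs on the order of N^3 cost evaluations in total. Caching each cluster's nearest neighbor, as FPNN does, or a short list of near neighbors, as the abstract describes, removes most of that repeated search.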