Fast agglomerative hierarchical clustering algorithm using Locality-Sensitive Hashing
Knowledge and Information Systems
Cluster aggregate inequality and multi-level hierarchical clustering
PKDD'05 Proceedings of the 9th European conference on Principles and Practice of Knowledge Discovery in Databases
As the amount of data handled in computer science grows, fast clustering algorithms are needed, because traditional clustering algorithms are not feasible for very large, high-dimensional data. Many studies on clustering large databases have been reported, but most of them sidestep the problem with approximation methods, which degrades accuracy. In this paper, we propose a new clustering algorithm based on a partial maximum array. It performs agglomerative hierarchical clustering with the same accuracy as the brute-force algorithm and has O(N^2) time complexity. We also present an incremental method of similarity computation that replaces the time-consuming calculation of vector similarity with scalar calculations. The experimental results show that clustering becomes significantly faster for large, high-dimensional data.
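The abstract does not spell out the incremental similarity computation, so the following is only a minimal sketch of the general idea under stated assumptions: similarity between clusters is taken to be the average pairwise dot product (group average), and the function names `dot` and `agglomerate` are hypothetical. Under that assumption, the similarity between a merged cluster and any other cluster can be obtained by adding two stored scalars rather than recomputing vector products. The sketch uses a brute-force search for the best pair and does not implement the paper's partial maximum array, so it does not reach the O(N^2) bound.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def agglomerate(points):
    """Brute-force agglomerative clustering with scalar similarity updates.

    Assumption (not from the paper): cluster similarity is the average
    pairwise dot product between members of the two clusters.
    Returns a list of (kept_cluster, absorbed_cluster, similarity) merges.
    """
    n = len(points)
    # S[a][b] holds the sum of dot products between members of a and b.
    # Vector dot products are computed only once, here.
    S = [[dot(points[a], points[b]) for b in range(n)] for a in range(n)]
    size = [1] * n
    active = list(range(n))
    merges = []

    while len(active) > 1:
        # Brute-force scan for the most similar pair of active clusters.
        best_sim, a, b = max(
            (S[p][q] / (size[p] * size[q]), p, q)
            for i, p in enumerate(active)
            for q in active[i + 1:]
        )
        # Incremental update: only scalar additions, no vector arithmetic.
        for k in active:
            if k != a and k != b:
                S[a][k] += S[b][k]
                S[k][a] = S[a][k]
        S[a][a] += 2 * S[a][b] + S[b][b]
        size[a] += size[b]
        active.remove(b)
        merges.append((a, b, best_sim))

    return merges


if __name__ == "__main__":
    demo = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    for kept, absorbed, sim in agglomerate(demo):
        print(kept, absorbed, round(sim, 4))
```

The design point is that after the initial matrix is built, each merge costs O(N) scalar additions regardless of the vector dimension, which is where a high-dimensional data set would benefit most.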