Average-link (AL) is a distance-based hierarchical clustering method that is not sensitive to noisy patterns. However, like all hierarchical clustering methods, AL needs to scan the dataset many times, and it has time and space complexity of O(n^2), where n is the size of the dataset. These costs prohibit the use of AL for large datasets. In this paper, we propose a distance-based hierarchical clustering method, termed l-AL, which speeds up the classical AL method in any metric (vector or non-vector) space. In this scheme, the leaders clustering method is first applied to the dataset to derive a set of leaders, and AL clustering is then applied to the leaders. To speed up the leaders clustering method, a technique for reducing the number of distance computations is also proposed. Experimental results confirm that l-AL is considerably faster than the classical AL method while keeping clustering results on par with it.
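The two-stage scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function names (`leaders`, `average_link`, `l_al`), the leader threshold `tau`, and the merge cut-off `h` are assumptions introduced here for illustration. The average-link step is a naive, unweighted version that ignores how many followers each leader has, and the reduction in distance computations mentioned in the abstract is not reproduced.

```python
import math


def leaders(points, tau, dist):
    """Single-scan leaders clustering (illustrative sketch).

    Each point joins the first existing leader within distance tau;
    otherwise it becomes a new leader. Returns the leader indices and
    a dict mapping each leader index to the indices of its followers.
    """
    leader_idx = []
    followers = {}
    for i, p in enumerate(points):
        for l in leader_idx:
            if dist(p, points[l]) <= tau:
                followers[l].append(i)
                break
        else:
            # No leader is close enough, so this point becomes a leader.
            leader_idx.append(i)
            followers[i] = [i]
    return leader_idx, followers


def average_link(items, dist, h):
    """Naive average-link clustering, stopped when the closest pair of
    clusters has an average inter-cluster distance greater than h."""
    clusters = [[i] for i in range(len(items))]

    def avg(a, b):
        total = sum(dist(items[i], items[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        d, a, b = min(
            (avg(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if d > h:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def l_al(points, tau, h, dist=math.dist):
    """Leaders first, then average-link on the leaders only (assumed
    reading of the l-AL scheme). Returns clusters of original indices."""
    leader_idx, followers = leaders(points, tau, dist)
    leader_pts = [points[l] for l in leader_idx]
    groups = average_link(leader_pts, dist, h)
    # Expand each group of leaders back to the points they represent.
    return [
        sorted(i for g in group for i in followers[leader_idx[g]])
        for group in groups
    ]


if __name__ == "__main__":
    pts = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (10, 10)]
    print(l_al(pts, tau=0.5, h=1.0))
```

Because the quadratic average-link step runs only on the leaders, its cost depends on the number of leaders rather than on n. Any metric can be passed as `dist`, which matches the claim that the approach applies in vector and non-vector spaces; `math.dist` (Euclidean) is only a default chosen for this sketch.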