Scalable local density-based distributed clustering

Authors:
Yan-Bing Liu;Zhang-Xiong Liu
Affiliations:
School of Computer Science, Chongqing University of Posts and Telecommunications, Chongqing 400065, PR China;School of Computer Science, Chongqing University of Posts and Telecommunications, Chongqing 400065, PR China
Venue:
Expert Systems with Applications: An International Journal
Year:
2011

Abstract

With the spread of networked applications, large amounts of high-dimensional data are now distributed across sites. Distributed clustering has become an increasingly important task because of a variety of real-life constraints, including bandwidth and security. Many distributed clustering algorithms have been proposed, but most of them suffer from high transmission cost and poor clustering quality. In this paper, we propose a scalable local density-based distributed clustering algorithm that adapts readily to high-dimensional data sets by using concepts such as the density attractor distance and the noise factor. To keep the transmission cost low, we select only noise points with a suitably low noise factor to send to the server. Test data sets, the CMC data set, and KDD-CUP-99 are used for experimental evaluation to validate the performance in practice. The experimental results and theoretical analysis show that the efficiency and clustering quality of the proposed algorithm are superior to those of other distributed clustering algorithms.