Clustering has become an increasingly important task in analysing huge amounts of data. Traditional applications require that all data be located at the site where it is analysed. Nowadays, large amounts of heterogeneous, complex data reside on different, independently working computers that are connected to each other via local or wide area networks. In this paper, we propose a scalable density-based distributed clustering algorithm that allows a user-defined trade-off between clustering quality and the number of objects transmitted from the local sites to a global server site. Our approach consists of the following steps. First, we order all objects located at a local site according to a quality criterion reflecting their suitability to serve as local representatives. Then we send the best of these representatives to a server site, where they are clustered with a slightly enhanced density-based clustering algorithm. This approach is very efficient because each site can determine its suitable representatives quickly and independently of the other sites. Furthermore, because the number of local representatives sent is scalable, the global clustering can be carried out effectively and efficiently. In our experimental evaluation, we show that our new scalable density-based distributed clustering approach yields high-quality clusterings with scalable transmission cost.
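The two steps described above can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not define the quality criterion or the enhancements to the server-side clustering, so the sketch assumes a hypothetical criterion (the number of neighbours within a radius `eps_local`) and uses plain DBSCAN on the server. All function and parameter names are illustrative, and `k` stands in for the user-defined number of representatives transmitted per site.

```python
from math import dist


def local_quality(points, eps):
    """Assumed quality criterion: neighbours within eps (the point itself included)."""
    return [sum(1 for q in points if dist(p, q) <= eps) for p in points]


def select_representatives(points, eps, k):
    """Order a site's objects by the assumed quality score and keep the k best."""
    scores = local_quality(points, eps)
    order = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return [points[i] for i in order[:k]]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN (not the paper's enhanced variant); label -1 marks noise."""
    labels = [None] * len(points)
    cluster = -1

    def neighbours(i):
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbours(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbours(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)  # j is a core point, keep expanding
    return labels


def distributed_clustering(sites, eps_local, k, eps_global, min_pts):
    """Each site sends its k best representatives; the server clusters only those."""
    representatives = []
    for site in sites:
        representatives.extend(select_representatives(site, eps_local, k))
    return representatives, dbscan(representatives, eps_global, min_pts)


# Two hypothetical sites, each holding one dense group and one outlier.
site_a = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
site_b = [(10.0, 10.0), (10.1, 10.0), (10.0, 10.1), (10.1, 10.1), (20.0, 0.0)]
reps, labels = distributed_clustering(
    [site_a, site_b], eps_local=0.5, k=3, eps_global=0.5, min_pts=2
)
```

In this sketch, raising `k` transmits more objects per site and gives the server a more detailed picture of the local data, while lowering it reduces transmission cost. That mirrors the trade-off the abstract describes, though how the actual algorithm exposes it is not stated here.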