Huge amounts of data are available in large-scale networks of autonomous data sources dispersed over a wide area. Data mining is an essential technology for extracting hidden and valuable knowledge from these networked data sources. In this paper, we investigate clustering, one of the most important data mining tasks, in one such networked computing environment: peer-to-peer (P2P) systems. The lack of central control and the sheer size of P2P systems make existing clustering techniques inapplicable. We propose a fully distributed clustering algorithm, called Peer dENsity-based cluStering (PENS), which addresses the key challenge of clustering in P2P environments, namely cluster assembly. The main idea of PENS is hierarchical cluster assembly, which enables peers to collaborate in forming a global clustering model without requiring central control or message flooding. Our complexity analysis shows that PENS can discover clusters and noise efficiently in P2P systems.
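To make the idea of hierarchical cluster assembly more concrete, the following is a minimal sketch and not the PENS algorithm itself, whose details the abstract does not give. It assumes that each peer has already computed local clusters as sets of 2-D points, that two clusters belong together when some pair of their points lies within a distance `eps`, and that peers are paired up level by level until one global model remains. The function names (`touches`, `merge_cluster_sets`, `assemble`) and the pairing order are hypothetical, and the whole exchange is simulated in a single process rather than over a network.

```python
from itertools import combinations
from math import dist


def touches(a, b, eps):
    """True if some point of cluster a lies within eps of some point of cluster b."""
    return any(dist(p, q) <= eps for p in a for q in b)


def merge_cluster_sets(left, right, eps):
    """Combine two lists of clusters, fusing any clusters that touch (assumed criterion)."""
    clusters = [set(c) for c in left + right]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            if touches(clusters[i], clusters[j], eps):
                clusters[i] |= clusters[j]  # absorb cluster j into cluster i
                del clusters[j]
                merged = True
                break  # indices changed, so rescan from the start
    return clusters


def assemble(peer_clusters, eps):
    """Merge per-peer cluster lists pairwise, level by level, into one global list."""
    level = [[set(c) for c in cs] for cs in peer_clusters]
    while len(level) > 1:
        nxt = []
        for k in range(0, len(level), 2):
            pair = level[k:k + 2]
            if len(pair) == 2:
                nxt.append(merge_cluster_sets(pair[0], pair[1], eps))
            else:
                nxt.append(pair[0])  # odd peer out moves up unchanged
        level = nxt
    return level[0] if level else []


# Three simulated peers; the points and eps are made-up illustration data.
peers = [
    [{(0, 0), (0, 1)}],
    [{(0, 2)}, {(10, 10)}],
    [{(10, 11)}],
]
global_clusters = assemble(peers, eps=1.0)
```

In this sketch each merge step only combines the models of two participants, so no single node ever has to collect raw data from every peer, which is the property the abstract attributes to hierarchical assembly. How PENS actually selects the peers that perform each merge and how it summarizes clusters are not specified here.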