The clustering algorithm DBSCAN relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape as well as to distinguish noise. In this paper, we present PDBSCAN, a parallel version of this algorithm. We use the ‘shared-nothing’ architecture with multiple computers interconnected through a network. A fundamental component of a shared-nothing system is its distributed data structure. We introduce the dR*-tree, a distributed spatial index structure in which the data is spread among multiple computers and the indexes of the data are replicated on every computer. We implemented our method using a number of workstations connected via Ethernet (10 Mbit). A performance evaluation shows that PDBSCAN offers nearly linear speedup and has excellent scaleup and sizeup behavior.
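For readers unfamiliar with the density-based notion of clusters, the following is a minimal sketch of sequential DBSCAN, the algorithm that PDBSCAN parallelizes. It is not the paper's implementation. The brute-force `region_query` merely stands in for the spatial index lookup that an R*-tree or dR*-tree would answer, and the names `eps`, `min_pts`, `region_query` and `dbscan` are illustrative assumptions.

```python
from math import dist

NOISE = -1
UNVISITED = None


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i].

    Brute force for illustration only; an index structure would
    normally answer this query (assumption, not the paper's code).
    """
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [UNVISITED] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            # Not a core point; may later be claimed as a border point.
            labels[i] = NOISE
            continue
        # points[i] is a core point: start a new cluster and expand it.
        labels[i] = cluster_id
        seeds = [j for j in neighbors if j != i]
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point joins the cluster
            if labels[j] is not UNVISITED:
                continue  # already assigned; do not expand again
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is also a core point
        cluster_id += 1
    return labels


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11),
           (50, 50)]
    print(dbscan(pts, eps=1.5, min_pts=3))
```

In this toy input the two groups of four points form two clusters and the isolated point is labeled as noise. In a shared-nothing setting, each computer would run such an expansion over its own portion of the data, with the neighborhood queries answered through the distributed index.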