Accelerating data-intensive science with Gordon and Dash
Proceedings of the 2010 TeraGrid Conference
DiscFinder: a data-intensive scalable cluster finder for astrophysics
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
A new scalable parallel DBSCAN algorithm using the disjoint-set data structure
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Scalable parallel OPTICS data clustering using graph algorithmic techniques
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Hi-index | 0.00 |
Clustering, or unsupervised classification, has many uses in fields that depend on grouping results from large amount of data, an example being the N-body cosmological simulation in astrophysics. In this paper, we study a particular clustering algorithm used in astrophysics, called HOP, and present a parallel implementation to speed up its current sequential implementation. Our approach first builds in parallel the spatial domain hierarchical data structure, a three-dimensional KD tree. Using a KD tree, the core of the HOP algorithm that searches for the highest density neighbor can be performed using only subsets of the particles and hence the communication cost is reduced. We evaluate our implementation by using data sets from a production cosmological application. The experimental results demonstrate up to 24 脳 speedup using 64 processors on three parallel processing machines.