Cosmological N-body simulations on parallel computers produce large datasets: gigabytes at each instant of simulated cosmological time, and hundreds of gigabytes over the course of a simulation. These large datasets require further analysis before they can be compared to astronomical observations. The “Halo World” tools include two methods for performing halo finding: identifying all of the gravitationally stable clusters in a point-sampled density field. One of these methods is a parallel implementation of the friends of friends (FOF) algorithm, widely used in the field of N-body cosmology. The new IsoDen method, based on isodensity surfaces, has been developed to overcome some of the shortcomings of FOF. Parallel processing is the only viable way of obtaining the necessary performance and storage capacity to carry out these analysis tasks. Ultimately, we must also plan to use disk storage as the only economically viable alternative for storing and manipulating such large data sets. Both IsoDen and friends of friends have been implemented on a variety of computer systems, with parallelism up to 512 processors, and successfully used to extract halos from simulations with up to 16.8 million particles.
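For readers unfamiliar with friends of friends, the basic idea can be sketched serially in a few lines. This is only an illustration of the general technique, not the parallel Halo World implementation: two particles are "friends" if they lie within a chosen linking length of each other, and a group is a set of particles connected through chains of friends. The function name `friends_of_friends`, the `linking_length` parameter, and the brute-force pair search are assumptions made for this sketch; a code handling millions of particles would need a spatial tree or grid and a distributed merge step.

```python
from itertools import combinations
from math import dist


def friends_of_friends(points, linking_length):
    """Group points into FOF clusters; returns a sorted list of index lists."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute-force O(n^2) pair search (an assumption made for brevity).
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= linking_length:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri  # merge the two groups

    # Collect particle indices by the root of their group.
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Calling `friends_of_friends(particles, 0.2)` on a list of 3-D coordinate tuples returns one list of particle indices per group. Halo finders typically discard groups below some minimum membership, which is left out here.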