ICAPR '09 Proceedings of the 2009 Seventh International Conference on Advances in Pattern Recognition
We report a linearly scaling parallel implementation of a clustering algorithm and its application to cluster analysis of very large datasets. WaveCluster is a novel clustering approach based on wavelet transforms. Although this approach can detect clusters of arbitrary shape efficiently, it requires a considerable amount of time to produce results for large multi-dimensional datasets. We propose a parallel implementation of the WaveCluster algorithm based on the message-passing model for a distributed-memory multiprocessor system. In the proposed method, communication among processors and memory requirements are kept to a minimum to achieve high efficiency. We conducted experiments on a dense dataset and a sparse dataset to characterize the algorithm's behavior. The results demonstrate that the parallel WaveCluster algorithm achieves high speedup and scales linearly with an increasing number of processors.
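For orientation, the following is a minimal serial sketch of the general idea behind wavelet-based grid clustering. It is not the authors' parallel implementation. The function name `wavecluster_sketch`, the 2-D unit-square input, the one-level Haar averaging, the 4-connectivity rule, and the `grid` and `threshold` parameters are all assumptions chosen for illustration.

```python
from collections import deque


def wavecluster_sketch(points, grid=8, threshold=1.0):
    """Label 2-D points in [0, 1) x [0, 1); 0 means noise.

    Illustrative only: `grid` must be even, and the Haar averaging
    stands in for whatever wavelet filter a real implementation uses.
    """
    # 1. Quantize the feature space: count points per grid cell.
    counts = [[0] * grid for _ in range(grid)]
    for x, y in points:
        i = min(int(x * grid), grid - 1)
        j = min(int(y * grid), grid - 1)
        counts[i][j] += 1

    # 2. One-level Haar approximation: average each 2x2 block of cells.
    half = grid // 2
    approx = [
        [
            (counts[2 * i][2 * j] + counts[2 * i + 1][2 * j]
             + counts[2 * i][2 * j + 1] + counts[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(half)
        ]
        for i in range(half)
    ]

    # 3. Keep significant cells and label 4-connected components (BFS).
    labels = [[0] * half for _ in range(half)]
    next_label = 0
    for i in range(half):
        for j in range(half):
            if approx[i][j] < threshold or labels[i][j]:
                continue
            next_label += 1
            labels[i][j] = next_label
            queue = deque([(i, j)])
            while queue:
                a, b = queue.popleft()
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    u, v = a + da, b + db
                    if (0 <= u < half and 0 <= v < half
                            and not labels[u][v]
                            and approx[u][v] >= threshold):
                        labels[u][v] = next_label
                        queue.append((u, v))

    # 4. Map every point back to the label of its coarse cell.
    result = []
    for x, y in points:
        i = min(int(x * grid), grid - 1) // 2
        j = min(int(y * grid), grid - 1) // 2
        result.append(labels[i][j])
    return result


if __name__ == "__main__":
    dense_a = [(0.05, 0.05), (0.1, 0.1), (0.15, 0.05),
               (0.2, 0.2), (0.1, 0.2), (0.05, 0.15)]
    dense_b = [(0.8, 0.8), (0.85, 0.9), (0.9, 0.8),
               (0.95, 0.95), (0.8, 0.9)]
    noise = [(0.5, 0.5)]
    print(wavecluster_sketch(dense_a + dense_b + noise))
```

In a message-passing setting, one plausible decomposition (again an assumption, not a description of the reported method) would give each processor a slab of the grid and have neighbors exchange only boundary cells, which is consistent with the stated goal of keeping communication and memory use low.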