This paper describes the design and implementation of P-AutoClass, a parallel version of the AutoClass system for MIMD parallel machines. AutoClass is based on a Bayesian method for determining the optimal classes in large datasets. P-AutoClass divides the clustering task among the processors of a multicomputer: each processor works on its own partition and exchanges its intermediate results with the others. The paper presents and discusses the system architecture, its implementation, and experimental performance results for different numbers of processors and dataset sizes. In particular, it evaluates the efficiency and scalability of P-AutoClass relative to the sequential AutoClass system.
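The partition-and-exchange scheme described above can be illustrated with a minimal sketch. It is not the P-AutoClass code. It assumes that the partition is over data items, that each class is a one-dimensional Gaussian, and that one refinement step is an EM-style update; the processors are simulated by a loop, and the exchange of intermediate results is modeled as an element-wise sum of per-processor partial statistics. All function names (partition, local_statistics, all_reduce, update_classes, parallel_step, speedup, efficiency) are hypothetical. The last two helpers use the standard definitions of speedup and efficiency, which may differ in detail from the metrics reported in the paper.

```python
import math


def partition(data, n_procs):
    """Split data into n_procs nearly equal contiguous blocks."""
    base, extra = divmod(len(data), n_procs)
    blocks, start = [], 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)
        blocks.append(data[start:start + size])
        start += size
    return blocks


def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def local_statistics(block, classes):
    """Work done by one (simulated) processor on its own block.

    For every local item, compute its membership weight in each class and
    accumulate the partial sums (sum w, sum w*x, sum w*x^2) per class.
    """
    stats = [[0.0, 0.0, 0.0] for _ in classes]
    for x in block:
        likes = [pi * gaussian_pdf(x, mean, var) for (pi, mean, var) in classes]
        total = sum(likes)
        for j, like in enumerate(likes):
            w = like / total
            stats[j][0] += w
            stats[j][1] += w * x
            stats[j][2] += w * x * x
    return stats


def all_reduce(partials):
    """Assumed stand-in for the exchange step: sum partial statistics."""
    n_classes = len(partials[0])
    return [[sum(p[j][k] for p in partials) for k in range(3)]
            for j in range(n_classes)]


def update_classes(stats, n_items):
    """Recompute (weight, mean, variance) of each class from global sums."""
    classes = []
    for w, wx, wxx in stats:
        mean = wx / w
        var = max(wxx / w - mean * mean, 1e-6)  # guard against zero variance
        classes.append((w / n_items, mean, var))
    return classes


def parallel_step(data, classes, n_procs):
    """One refinement step: local work per block, then a global combine."""
    partials = [local_statistics(b, classes) for b in partition(data, n_procs)]
    return update_classes(all_reduce(partials), len(data))


def speedup(t_sequential, t_parallel):
    return t_sequential / t_parallel


def efficiency(t_sequential, t_parallel, n_procs):
    return speedup(t_sequential, t_parallel) / n_procs
```

Because only the per-class sums cross processor boundaries, the amount of exchanged data in this sketch depends on the number of classes rather than on the dataset size, and running the step on one block or on several gives the same class parameters up to floating-point rounding.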