The deluge of data available for analysis demands that data mining implementations scale in performance. Given current architectural trends, a major challenge is achieving both programmability and performance for data mining applications on multi-core machines and clusters of multi-core machines. To address this problem, we have been developing a runtime framework, FREERIDE, that enables parallel execution of data mining and data analysis tasks. The contributions of this paper are two-fold: 1) we describe and evaluate various shared-memory parallelization techniques developed in our runtime system on a cluster of multi-cores, and 2) we report a detailed performance study to understand why certain parallelization techniques outperform other techniques for a particular application.