We present parallel algorithms for building decision-tree classifiers on shared-memory multiprocessor (SMP) systems. The proposed algorithms span the gamut of data and task parallelism. The data parallelism is based on attribute scheduling among processors. This basic scheme is extended with task pipelining and dynamic load balancing to yield faster implementations. The task parallel approach uses dynamic subtree partitioning among processors. Our performance evaluation shows that the construction of a decision-tree classifier can be effectively parallelized on an SMP machine with good speedup.
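The attribute-scheduling idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`gini`, `best_split_for_attribute`, `parallel_best_split`), the choice of Gini impurity as the split criterion, and the use of a thread pool are all assumptions made for illustration. Each attribute is treated as an independent unit of work, and idle workers pick up the next unevaluated attribute, which gives a simple form of dynamic load balancing at a single tree node.

```python
from concurrent.futures import ThreadPoolExecutor
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split_for_attribute(rows, labels, attr):
    """Scan one numeric attribute; return (weighted_gini, attr, threshold)."""
    order = sorted(range(len(rows)), key=lambda i: rows[i][attr])
    best = (float("inf"), attr, None)
    for k in range(1, len(order)):
        lo, hi = rows[order[k - 1]][attr], rows[order[k]][attr]
        if lo == hi:
            continue  # no threshold separates equal values
        left = [labels[i] for i in order[:k]]
        right = [labels[i] for i in order[k:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(order)
        if score < best[0]:
            best = (score, attr, (lo + hi) / 2)
    return best


def parallel_best_split(rows, labels, n_workers=4):
    """Evaluate every attribute concurrently; keep the lowest-impurity split."""
    n_attrs = len(rows[0])
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Hypothetical scheduling: idle workers pull the next attribute
        # from the pool's internal queue.
        results = pool.map(
            lambda a: best_split_for_attribute(rows, labels, a), range(n_attrs)
        )
        return min(results, key=lambda r: r[0])


if __name__ == "__main__":
    rows = [(1.0, 5.0), (2.0, 1.0), (3.0, 6.0), (4.0, 2.0)]
    labels = ["a", "a", "b", "b"]
    print(parallel_best_split(rows, labels))
```

In CPython the global interpreter lock prevents threads from speeding up this CPU-bound scan, so the sketch shows only the scheduling structure, not the speedup reported for SMP hardware. The task-parallel variant described above would instead hand whole subtrees to workers once the tree has enough independent nodes.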