Decision Tree Construction for Data Mining on Grid Computing
EEE '04 Proceedings of the 2004 IEEE International Conference on e-Technology, e-Commerce and e-Service (EEE'04)
Decision trees are among the most effective and widely used induction methods and have received a great deal of attention over the past twenty years. When a decision tree induction algorithm is applied to uncertain rather than deterministic data, the result is a complete tree that can classify most unseen samples correctly. Such a tree is then pruned to reduce its classification error and over-fitting. Recent research on parallel decision trees has concentrated on handling large databases in a reasonable amount of time. In this paper we present new parallel learning methods that induce a decision tree from an overlapping partition of the training set. Our methods combine multiple induction algorithms, each running on a different processor, and build on the Kramer method and a fuzzy model to control and combine the results of the individual learners into the final tree. Experimental results show that when the attributes and classes in the training set are uniformly distributed and the training set is not too small, these methods achieve a statistically lower error rate than existing methods.
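The general scheme described above, partitioning the training set with overlap, inducing a tree on each partition in parallel, and combining the results, can be sketched as follows. This is a minimal illustration only: the paper's combiner is based on the Kramer method and a fuzzy model, whereas the sketch below substitutes plain majority voting over one-level trees (decision stumps); the function names, the stump learner, and the overlap scheme are all illustrative assumptions, not the authors' implementation.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label; defaults to 0 for an empty side of a split."""
    return Counter(labels).most_common(1)[0][0] if labels else 0

def overlapping_partitions(rows, k, overlap=0.25):
    """Split rows into k chunks, then extend each chunk with a random
    sample drawn from the other chunks (the overlap). Illustrative
    stand-in for the paper's overlapping partitioning."""
    rows = rows[:]
    random.shuffle(rows)
    size = len(rows) // k
    chunks = [rows[i * size:(i + 1) * size] for i in range(k)]
    extra = max(1, int(overlap * size))
    for i in range(k):
        rest = [r for j in range(k) if j != i for r in chunks[j]]
        chunks[i] = chunks[i] + random.sample(rest, min(extra, len(rest)))
    return chunks

def train_stump(rows):
    """One-level decision tree: exhaustively pick the (feature,
    threshold) split with the fewest training errors."""
    best = None
    n_features = len(rows[0][0])
    for f in range(n_features):
        for x, _y in rows:
            t = x[f]
            left = [y for xv, y in rows if xv[f] <= t]
            right = [y for xv, y in rows if xv[f] > t]
            err = (sum(y != majority(left) for y in left)
                   + sum(y != majority(right) for y in right))
            if best is None or err < best[0]:
                best = (err, f, t, majority(left), majority(right))
    _, f, t, left_label, right_label = best
    return (f, t, left_label, right_label)

def predict(stump, x):
    f, t, left_label, right_label = stump
    return left_label if x[f] <= t else right_label

def combine_predict(stumps, x):
    """Combine the per-partition trees by majority vote (a plain
    stand-in for the paper's Kramer/fuzzy combiner)."""
    return majority([predict(s, x) for s in stumps])

# Toy data: 40 points, class 1 iff the first feature exceeds 0.5;
# the second feature is uninformative noise.
random.seed(0)
data = [((i / 40.0, (i * 7 % 40) / 40.0), int(i / 40.0 > 0.5))
        for i in range(40)]
parts = overlapping_partitions(data, k=4)       # one chunk per "processor"
forest = [train_stump(p) for p in parts]        # induce trees in parallel
print(combine_predict(forest, (0.9, 0.1)),      # point deep in class 1
      combine_predict(forest, (0.1, 0.9)))      # point deep in class 0
```

In a real grid setting each `train_stump` call would run on a separate processor; here they run sequentially only to keep the sketch self-contained.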