Decision Tree Construction for Data Mining on Grid Computing
EEE '04 Proceedings of the 2004 IEEE International Conference on e-Technology, e-Commerce and e-Service (EEE'04)
Decision trees are among the most effective and widely used induction methods and have received a great deal of attention over the past twenty years. When a decision tree induction algorithm is applied to uncertain rather than deterministic data, the result is a complete tree that can classify most unseen samples correctly. Such a tree is then pruned to reduce its classification error and over-fitting. Recent research on parallel decision trees has concentrated on handling large databases in a reasonable amount of time. In this paper we present new parallel learning methods that induce a decision tree from an overlapping partition of the training set. Our methods combine multiple induction algorithms, each running on a different processor, and build on the Kramer method and a fuzzy model to control and combine the results of the individual learners into the final tree. Experimental results show that when the attributes and classes in the training set are uniformly distributed and the training set is not too small, these methods achieve a statistically lower error rate than existing methods.
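The general scheme described above, partitioning the training set with overlap, inducing a tree on each partition in parallel, and combining the results, can be sketched as follows. This is a minimal illustration only: the paper's combiner is based on the Kramer method and a fuzzy model, whereas the sketch below substitutes plain majority voting over one-level trees (decision stumps); the function names, the stump learner, and the overlap scheme are all illustrative assumptions, not the authors' implementation.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label; defaults to 0 for an empty side of a split."""
    return Counter(labels).most_common(1)[0][0] if labels else 0

def overlapping_partitions(rows, k, overlap=0.25):
    """Split rows into k chunks, then extend each chunk with a random
    sample drawn from the other chunks (the overlap). Illustrative
    stand-in for the paper's overlapping partitioning."""
    rows = rows[:]
    random.shuffle(rows)
    size = len(rows) // k
    chunks = [rows[i * size:(i + 1) * size] for i in range(k)]
    extra = max(1, int(overlap * size))
    for i in range(k):
        rest = [r for j in range(k) if j != i for r in chunks[j]]
        chunks[i] = chunks[i] + random.sample(rest, min(extra, len(rest)))
    return chunks

def train_stump(rows):
    """One-level decision tree: exhaustively pick the (feature,
    threshold) split with the fewest training errors."""
    best = None
    n_features = len(rows[0][0])
    for f in range(n_features):
        for x, _y in rows:
            t = x[f]
            left = [y for xv, y in rows if xv[f] <= t]
            right = [y for xv, y in rows if xv[f] > t]
            err = (sum(y != majority(left) for y in left)
                   + sum(y != majority(right) for y in right))
            if best is None or err < best[0]:
                best = (err, f, t, majority(left), majority(right))
    _, f, t, left_label, right_label = best
    return (f, t, left_label, right_label)

def predict(stump, x):
    f, t, left_label, right_label = stump
    return left_label if x[f] <= t else right_label

def combine_predict(stumps, x):
    """Combine the per-partition trees by majority vote (a plain
    stand-in for the paper's Kramer/fuzzy combiner)."""
    return majority([predict(s, x) for s in stumps])

# Toy data: 40 points, class 1 iff the first feature exceeds 0.5;
# the second feature is uninformative noise.
random.seed(0)
data = [((i / 40.0, (i * 7 % 40) / 40.0), int(i / 40.0 > 0.5))
        for i in range(40)]
parts = overlapping_partitions(data, k=4)       # one chunk per "processor"
forest = [train_stump(p) for p in parts]        # induce trees in parallel
print(combine_predict(forest, (0.9, 0.1)),      # point deep in class 1
      combine_predict(forest, (0.1, 0.9)))      # point deep in class 0
```

In a real grid setting each `train_stump` call would run on a separate processor; here they run sequentially only to keep the sketch self-contained.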