Global Induction of Decision Trees: From Parallel Implementation to Distributed Evolution

  • Authors:
  • Marek Kretowski;Piotr Popczyński

  • Affiliations:
  • Faculty of Computer Science, Białystok Technical University, Białystok, Poland 15-351;Faculty of Computer Science, Białystok Technical University, Białystok, Poland 15-351

  • Venue:
  • ICAISC '08 Proceedings of the 9th international conference on Artificial Intelligence and Soft Computing
  • Year:
  • 2008

Abstract

In most data mining systems, decision trees are induced in a top-down manner. This greedy method is fast but can fail for certain classification problems. As an alternative, a global approach based on evolutionary algorithms (EAs) can be applied. We developed the Global Decision Tree (GDT) system, which learns the tree structure and the tests in a single run of the EA. Specialized genetic operators allow the system to exchange parts of trees, generate new sub-trees, prune existing ones, and change the node type and the tests. The system is able to induce univariate, oblique and mixed decision trees. In this paper, we investigate how the GDT system can profit from parallelization on a compute cluster. Both a parallel implementation and a distributed version of the induction are considered, and significant speedups are obtained. Preliminary experimental results show that, at least for certain problems, the distributed version of the GDT system is more accurate than its panmictic predecessor.
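
The difference between a panmictic EA (one population in which any individual may mate with any other) and a distributed EA (several sub-populations that evolve separately and occasionally exchange individuals) can be illustrated with a minimal island-model sketch. This is not the GDT system: individuals here are plain bit strings scored by a toy fitness function rather than decision trees, and every name and parameter (`evolve_islands`, the ring migration topology, the migration `interval`) is an assumption made for illustration only.

```python
import random


def fitness(ind):
    # Toy stand-in for tree quality: the number of 1-bits (OneMax).
    return sum(ind)


def mutate(ind, rng, rate=0.05):
    # Flip each bit independently with probability `rate`.
    return [1 - g if rng.random() < rate else g for g in ind]


def crossover(a, b, rng):
    # One-point crossover, loosely analogous to exchanging parts of two trees.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def step(pop, rng):
    # One generation: binary tournament selection, crossover, mutation,
    # and elitism (the best individual survives unchanged).
    best = max(pop, key=fitness)

    def pick():
        x, y = rng.sample(pop, 2)
        return x if fitness(x) >= fitness(y) else y

    children = [mutate(crossover(pick(), pick(), rng), rng)
                for _ in range(len(pop) - 1)]
    return children + [best]


def migrate(islands):
    # Ring topology: a copy of each island's best individual replaces
    # the worst individual of the next island.
    bests = [max(pop, key=fitness) for pop in islands]
    for i, pop in enumerate(islands):
        worst = min(range(len(pop)), key=lambda j: fitness(pop[j]))
        pop[worst] = list(bests[i - 1])


def evolve_islands(n_islands=4, pop_size=20, length=30,
                   generations=40, interval=5, seed=0):
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(length)]
                for _ in range(pop_size)]
               for _ in range(n_islands)]
    for gen in range(1, generations + 1):
        islands = [step(pop, rng) for pop in islands]
        if gen % interval == 0:
            migrate(islands)
    return islands


if __name__ == "__main__":
    final = evolve_islands()
    print(max(fitness(ind) for pop in final for ind in pop))
```

Setting `n_islands=1` reduces the sketch to a single panmictic population. On a cluster, each island would typically run in its own process, with migration carried out by message passing; the sequential loop here only mimics that structure.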