When a data set is very large, it might be impractical to construct a decision tree using all of the points. Even when it is possible, this might not be the best way to utilize the data. As an alternative, subsets of the original data set can be extracted, a tree can be constructed on each subset, and parts of the individual trees can then be combined to produce an improved final tree or an improved final set of feasible trees. In this paper, we take trees generated by a commercial decision tree package, namely C4.5, and allow them to cross over and mutate (using a genetic algorithm) for a number of generations in order to yield trees of better quality. We conduct a computational study of our approach using a real-life marketing data set. In this study, we divide the data set into training, scoring, and test sets, and find that our approach produces uniformly high-quality decision trees. In addition, we investigate the impact of scaling and demonstrate that our approach can be used effectively on very large data sets.
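The general idea of evolving subset-trained trees can be illustrated with a minimal sketch. It is not the implementation used in the paper: the one-split `stump` builder is a hypothetical stand-in for C4.5, and the tree encoding, the operators, the elitist selection scheme, and every parameter value (population size, mutation rate, node cap) are assumptions chosen only to keep the example short. Fitness is taken to be accuracy on the scoring set.

```python
import random

# Assumed encoding: a leaf is a class label (any non-tuple value);
# an internal node is a tuple (feature_index, threshold, left, right).

def predict(tree, x):
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree

def accuracy(tree, rows):
    return sum(predict(tree, x) == y for x, y in rows) / len(rows)

def paths(tree, prefix=()):
    """Yield the path to every node; 2 means left child, 3 means right child."""
    yield prefix
    if isinstance(tree, tuple):
        yield from paths(tree[2], prefix + (2,))
        yield from paths(tree[3], prefix + (3,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def put(tree, path, sub):
    """Return a copy of tree with the node at path replaced by sub."""
    if not path:
        return sub
    node = list(tree)
    node[path[0]] = put(tree[path[0]], path[1:], sub)
    return tuple(node)

def crossover(a, b, rng):
    """Replace a random node of a with a random subtree of b."""
    pa = rng.choice(list(paths(a)))
    pb = rng.choice(list(paths(b)))
    return put(a, pa, get(b, pb))

def mutate(tree, rng, labels):
    """Nudge one threshold or relabel one leaf (illustrative operator)."""
    p = rng.choice(list(paths(tree)))
    node = get(tree, p)
    if isinstance(node, tuple):
        f, t, left, right = node
        return put(tree, p, (f, t + rng.uniform(-0.1, 0.1), left, right))
    return put(tree, p, rng.choice(labels))

def stump(rows, rng):
    """Hypothetical stand-in for C4.5: one split on a random feature."""
    def majority(part, default):
        ys = [y for _, y in part]
        return max(sorted(set(ys)), key=ys.count) if ys else default
    f = rng.randrange(len(rows[0][0]))
    t = sum(x[f] for x, _ in rows) / len(rows)
    default = majority(rows, None)
    left = [r for r in rows if r[0][f] <= t]
    right = [r for r in rows if r[0][f] > t]
    return (f, t, majority(left, default), majority(right, default))

def evolve(train, scoring, n_trees=10, subset_size=50,
           generations=20, max_nodes=63, seed=0):
    rng = random.Random(seed)
    labels = sorted({y for _, y in train})
    # One initial tree per random subset of the training data.
    pop = [stump(rng.sample(train, subset_size), rng) for _ in range(n_trees)]
    fitness = lambda t: accuracy(t, scoring)
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: n_trees // 2]  # elitism: the best half survives
        children = []
        while len(parents) + len(children) < n_trees:
            a, b = rng.sample(parents, 2)
            child = crossover(a, b, rng)
            if sum(1 for _ in paths(child)) > max_nodes:
                child = a  # reject oversized offspring
            if rng.random() < 0.2:
                child = mutate(child, rng, labels)
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)
```

Because the better half of the population is carried over unchanged in each generation, the best scoring-set accuracy in this sketch can never fall below that of the best initial tree. The held-out test set plays no part in selection and would be used only to evaluate the tree that `evolve` returns.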