Generation of comprehensible decision trees through evolution of training data

Authors:
T. Endou;Qiangfu Zhao
Affiliations:
The Univ. of Aizu, Aizu-Wakamatsu, Japan;The Univ. of Aizu, Aizu-Wakamatsu, Japan
Venue:
CEC '02: Proceedings of the 2002 Congress on Evolutionary Computation - Volume 02
Year:
2002

Citing 0
Cited 10

Evolutionary approaches to fuzzy modelling for classification
The Knowledge Engineering Review

Logistic regression using covariates obtained by product-unit neural network models
Pattern Recognition

Identification of interpretable and accurate fuzzy classifiers and function estimators with hybrid methods
Applied Soft Computing

A co-evolving decision tree classification method
Expert Systems with Applications: An International Journal

Application of wrapper approach and composite classifier to the stock trend prediction
Expert Systems with Applications: An International Journal

Quality management in GPRS networks with fuzzy case-based reasoning
Knowledge-Based Systems

Speed-up of the R4-rule for distance-based neural network learning
SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics

Using data envelopment analysis and decision trees for efficiency analysis and recommendation of B2C controls
Decision Support Systems

Applications of machine learning techniques to a sensor-network-based prosthesis training system
Applied Soft Computing

Generating smart robot controllers through co-evolution
EUC'05 Proceedings of the 2005 international conference on Embedded and Ubiquitous Computing

Abstract

In machine learning, decision trees (DTs) are usually considered comprehensible because a reasoning process can be given for each conclusion. When the data set is large, however, the resulting DTs may become so large that they are no longer comprehensible. We have previously proposed several methods to increase the comprehensibility of DTs: evolving DTs using genetic programming (GP) with tree size as a secondary fitness measure, initializing GP with results obtained by C4.5, and introducing the divide-and-conquer concept into GP. None of these produced results that were good enough. All of these approaches tried to design good DTs from given, fixed data. In this paper, we look at the problem from a different point of view. The basic idea is to evolve a small data set that covers the domain knowledge as well as possible. From this data set, a small but good DT can be designed. The validity of the new algorithm is verified through several experiments.
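
The basic idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the representation, the genetic operators, or the fitness function, so everything below is an assumption made for illustration. An individual is taken to be a list of `k` indices into the full data set, a simple Gini-based tree builder stands in for whatever DT inducer the paper uses, and fitness is the accuracy on the full data of the tree built from the selected subset. The names `make_data`, `build_tree`, `predict`, `size`, `fitness`, and `evolve` are all hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows):
    """rows: list of (features, label). Returns a label (leaf) or a
    tuple (feature, threshold, left_subtree, right_subtree)."""
    labels = [y for _, y in rows]
    if len(set(labels)) == 1:
        return labels[0]
    best = None
    for f in range(len(rows[0][0])):
        # Skip the largest value so that both sides of a split are non-empty.
        for t in sorted({x[f] for x, _ in rows})[:-1]:
            left = [r for r in rows if r[0][f] <= t]
            right = [r for r in rows if r[0][f] > t]
            score = (len(left) * gini([y for _, y in left])
                     + len(right) * gini([y for _, y in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # identical feature vectors with mixed labels
        return Counter(labels).most_common(1)[0][0]
    _, f, t, left, right = best
    return (f, t, build_tree(left), build_tree(right))

def predict(tree, x):
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree

def size(tree):
    """Number of nodes (internal nodes plus leaves)."""
    if isinstance(tree, tuple):
        return 1 + size(tree[2]) + size(tree[3])
    return 1

def fitness(ind, data):
    # Assumed fitness: accuracy on the FULL data of the tree built from the subset.
    tree = build_tree([data[i] for i in ind])
    return sum(predict(tree, x) == y for x, y in data) / len(data)

def evolve(data, k=6, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randrange(len(data)) for _ in range(k)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(ind, data), reverse=True)
        parents = pop[:pop_size // 2]  # truncation selection; parents survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child[rng.randrange(k)] = rng.randrange(len(data))  # point mutation
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda ind: fitness(ind, data))
    return best, build_tree([data[i] for i in best])

def make_data(n=200, seed=1):
    # Toy data (an assumption): the class depends only on the first feature.
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    return [(p, int(p[0] > 0.5)) for p in pts]

if __name__ == "__main__":
    data = make_data()
    best, tree = evolve(data)
    print("subset indices:", best)
    print("tree size:", size(tree), "accuracy:", fitness(best, data))
```

Because the subset holds at most `k` examples, the induced tree has at most `k` leaves, so tree size is bounded by the subset size rather than by the size of the original data. This is one plausible reading of why a small evolved data set leads to a small DT.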