In the Knowledge Discovery in Databases (KDD) field, the human comprehensibility of models is as important as their accuracy. To address this need, many methods have been proposed to simplify decision trees and improve their understandability. Among these methods are strategies that reduce the database a priori, through either feature selection or case selection. At the same time, many other efficient selection algorithms have been developed to reduce the storage requirements of case-based learning algorithms; their original aim is therefore not tree simplification. Surprisingly, as far as we know, few works have attempted to exploit this wealth of efficient algorithms for knowledge discovery. That is the aim of this paper. Through large-scale experiments and discussion, we analyze the contribution of state-of-the-art instance reduction techniques to tree simplification. We show that in some cases these algorithms are very effective at improving the performance of standard post-pruning, which is used to combat overfitting.
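To make the idea of a priori case selection concrete, the sketch below applies one classic instance filter, Wilson's edited nearest-neighbour rule, to a toy dataset. It is an illustration only: the abstract does not say which reduction algorithms were evaluated, and the function name `wilson_edit`, the choice of `k`, and the data are all assumptions introduced here. In the setting the abstract describes, a decision tree would then be induced on the reduced set instead of the full one.

```python
from collections import Counter
from math import dist


def wilson_edit(points, labels, k=3):
    """Return indices of instances kept by Wilson's edited-nearest-neighbour rule.

    An instance is kept only if its label matches the majority label among
    its k nearest neighbours (Euclidean distance, the instance itself excluded).
    Hypothetical helper for illustration; not taken from the paper.
    """
    kept = []
    for i, (p, y) in enumerate(zip(points, labels)):
        # Rank every other instance by distance to p and take the k closest.
        neighbours = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )[:k]
        majority, _ = Counter(labels[j] for j in neighbours).most_common(1)[0]
        if majority == y:
            kept.append(i)
    return kept


# Toy data (assumed): two well-separated clusters, with one point in the
# first cluster carrying the label of the second.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
          (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
labels = ["a", "a", "a", "b", "b", "b", "b"]

kept = wilson_edit(points, labels, k=3)
reduced_points = [points[i] for i in kept]
reduced_labels = [labels[i] for i in kept]
```

On this toy data the filter discards only the point whose label disagrees with its neighbourhood. Removing such instances before induction is one plausible way a learner could avoid growing extra branches to fit them, which is the kind of effect the abstract compares against post-pruning.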