C4.5: programs for machine learning
Technical Note: Bias and the Quantification of Stability
Machine Learning - Special issue on bias evaluation and selection
Knowledge Acquisition from Examples Via Multiple Models
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Exploiting the Cost (In)sensitivity of Decision Tree Splitting Criteria
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Decision Tree Instability and Active Learning
ECML '07 Proceedings of the 18th European conference on Machine Learning
In real-world problems solved with machine learning techniques, achieving small error rates is important, but there are situations where an explanation of the classification is compulsory, and then the stability of that explanation is crucial. We have presented a methodology for building classification trees, the Consolidated Trees Construction algorithm (CTC). CTC is based on subsampling techniques, so it is well suited to class imbalance problems, and it improves the error rate of standard classification trees while achieving greater structural stability. The built trees become steadier as the number of subsamples used for induction increases, and therefore the explanation associated with the classification also becomes steadier and wider. In this paper a model is presented for estimating the number of subsamples that would be needed to achieve the desired level of structural convergence. The values estimated with the model are very close to the real values, with no statistically significant differences.
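The core idea behind consolidation, as described above, is that many subsamples jointly decide each split, so the resulting single tree becomes structurally steadier as the number of subsamples grows. A minimal, simplified sketch of that voting step is given below; it is not the authors' implementation, and the split-scoring criterion (a class-mean gap instead of a real impurity measure such as gain ratio), the function names, and the bootstrap resampling scheme are all illustrative assumptions:

```python
import random
from collections import Counter

def best_split_feature(sample, n_features):
    # Toy scoring: pick the feature with the largest gap between class means
    # (a stand-in for a real impurity criterion such as gain ratio).
    best_f, best_score = 0, -1.0
    for f in range(n_features):
        pos = [x[f] for x, y in sample if y == 1]
        neg = [x[f] for x, y in sample if y == 0]
        if not pos or not neg:
            continue
        score = abs(sum(pos) / len(pos) - sum(neg) / len(neg))
        if score > best_score:
            best_f, best_score = f, score
    return best_f

def consolidated_split(data, n_subsamples, n_features, seed=0):
    # Each subsample proposes a split feature; the majority proposal is
    # adopted for the single consolidated tree. With more subsamples the
    # vote stabilises, which is the convergence the abstract refers to.
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_subsamples):
        sub = [rng.choice(data) for _ in range(len(data))]
        votes[best_split_feature(sub, n_features)] += 1
    return votes.most_common(1)[0][0]
```

In a full consolidated tree this vote would be repeated at every node, with all subsamples then partitioned by the winning split before recursing, so the tree structure itself (not just the predictions) is shared across subsamples.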