Consolidated trees: an analysis of structural convergence

Authors:
Jesús M. Pérez;Javier Muguerza;Olatz Arbelaitz;Ibai Gurrutxaga;José I. Martín
Affiliations:
Dept. of Computer Architecture and Technology, University of the Basque Country, Donostia, Spain;Dept. of Computer Architecture and Technology, University of the Basque Country, Donostia, Spain;Dept. of Computer Architecture and Technology, University of the Basque Country, Donostia, Spain;Dept. of Computer Architecture and Technology, University of the Basque Country, Donostia, Spain;Dept. of Computer Architecture and Technology, University of the Basque Country, Donostia, Spain
Venue:
Data Mining
Year:
2006

Abstract

When different subsamples of the same data set are used to induce classification trees, the resulting classifiers often differ greatly in structure. The stability of the tree structure is of capital importance in many domains where the classifier must be comprehensible, such as illness diagnosis, fraud detection in different fields, and customer behaviour analysis (marketing). We have developed a methodology for building classification trees from multiple samples in which the final classifier is a single decision tree (Consolidated Trees). The paper presents an analysis of the structural stability of our algorithm versus the C4.5 algorithm. The classification trees generated with our algorithm achieve smaller error rates and are structurally more stable than C4.5 trees when resampling techniques are used. The main focus of this paper is to show how Consolidated Trees built with different sets of subsamples tend to converge to the same tree as the number of subsamples increases.
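The instability described in the abstract can be illustrated with a small toy experiment. The sketch below is not the Consolidated Trees algorithm and not C4.5: it only looks at the root split, it uses plain information gain, and the majority vote over subsamples is an assumption made here for illustration. The synthetic data and the helper names (`make_data`, `voted_root`, `agreement`) are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def info_gain(rows, attr):
    """Information gain of splitting rows of (features, label) on a discrete attribute."""
    groups = {}
    for x, y in rows:
        groups.setdefault(x[attr], []).append(y)
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([y for _, y in rows]) - remainder


def best_attr(rows, n_attrs):
    """Index of the attribute with the highest information gain (the root split)."""
    return max(range(n_attrs), key=lambda a: info_gain(rows, a))


def make_data(n, rng):
    """Hypothetical data: two nearly equally informative attributes plus one noise attribute."""
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        a0 = y if rng.random() < 0.80 else 1 - y
        a1 = y if rng.random() < 0.78 else 1 - y
        a2 = rng.randint(0, 1)
        rows.append(((a0, a1, a2), y))
    return rows


def voted_root(data, n_subsamples, size, rng):
    """Root attribute chosen by majority vote over several subsamples (assumed scheme)."""
    votes = Counter(best_attr(rng.sample(data, size), 3) for _ in range(n_subsamples))
    return votes.most_common(1)[0][0]


def agreement(data, n_subsamples, size, runs, rng):
    """Fraction of independent runs that end up with the most frequent root attribute."""
    roots = Counter(voted_root(data, n_subsamples, size, rng) for _ in range(runs))
    return roots.most_common(1)[0][1] / runs


if __name__ == "__main__":
    rng = random.Random(0)
    data = make_data(400, rng)
    for k in (1, 5, 25):
        print(k, "subsample(s): root agreement =", agreement(data, k, 60, 50, rng))
```

With a single subsample, `agreement` measures how often independently drawn subsamples pick the same root attribute. With more subsamples per run it measures how often different sets of subsamples agree after voting, which is a rough analogue of the structural convergence the paper studies.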