C4.5: programs for machine learning
Data mining: practical machine learning tools and techniques with Java implementations
Inference for the Generalization Error
Machine Learning
The Journal of Machine Learning Research
Ensembles of nested dichotomies for multi-class problems
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Solving multiclass learning problems via error-correcting output codes
Journal of Artificial Intelligence Research
HISSCLU: a hierarchical density-based method for semi-supervised clustering
EDBT '08 Proceedings of the 11th international conference on Extending database technology: Advances in database technology
Human Distress Sound Analysis and Characterization Using Advanced Classification Techniques
SETN '08 Proceedings of the 5th Hellenic conference on Artificial Intelligence: Theories, Models and Applications
Proceedings of the 2nd International Conference on PErvasive Technologies Related to Assistive Environments
The WEKA data mining software: an update
ACM SIGKDD Explorations Newsletter
Pattern Recognition Letters
Large Margin Hierarchical Classification with Mutually Exclusive Class Membership
The Journal of Machine Learning Research
Engineering Applications of Artificial Intelligence
Evaluating data mining algorithms using molecular dynamics trajectories
International Journal of Data Mining and Bioinformatics
A system of nested dichotomies is a hierarchical decomposition of a multi-class problem with c classes into c − 1 two-class problems and can be represented as a tree structure. Ensembles of randomly generated nested dichotomies have proven to be an effective approach to multi-class learning problems [1]. However, sampling trees by giving each tree equal probability means that the depth of a tree is limited only by the number of classes, and very unbalanced trees can negatively affect runtime. In this paper we investigate two approaches to building balanced nested dichotomies (class-balanced nested dichotomies and data-balanced nested dichotomies) and evaluate them in the same ensemble setting. Using C4.5 decision trees as the base models, we show that both approaches can reduce runtime with little or no effect on accuracy, especially on problems with many classes. We also investigate the effect of caching models when building ensembles of nested dichotomies.
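To make the tree structure concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a class-balanced nested dichotomy can be built by shuffling the class set and splitting it into two halves of (near-)equal size at every node, and it contrasts the result with a maximally unbalanced "chain" tree. All function names (`class_balanced_dichotomy`, `chain_dichotomy`, `count_internal`, `depth`) are hypothetical and chosen only for illustration. No base classifiers are trained here; the sketch only shows the shape of the decomposition.

```python
import random


def class_balanced_dichotomy(classes, rng):
    """Recursively split a set of class labels into two halves of
    (near-)equal size. Leaves are single labels; internal nodes are
    (left, right) tuples, each standing for one two-class problem.
    Assumption: labels themselves are not tuples."""
    classes = list(classes)
    if len(classes) == 1:
        return classes[0]
    rng.shuffle(classes)          # random assignment of classes to halves
    mid = len(classes) // 2
    return (class_balanced_dichotomy(classes[:mid], rng),
            class_balanced_dichotomy(classes[mid:], rng))


def chain_dichotomy(classes):
    """Maximally unbalanced tree: separate one class from the rest
    at every level."""
    classes = list(classes)
    if len(classes) == 1:
        return classes[0]
    return (classes[0], chain_dichotomy(classes[1:]))


def count_internal(tree):
    """Number of internal nodes, i.e. number of two-class problems."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + count_internal(tree[0]) + count_internal(tree[1])


def depth(tree):
    """Longest path from the root to a leaf, counted in edges."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(tree[0]), depth(tree[1]))


if __name__ == "__main__":
    rng = random.Random(0)
    balanced = class_balanced_dichotomy(range(8), rng)
    chain = chain_dichotomy(range(8))
    print("balanced:", count_internal(balanced), "problems, depth", depth(balanced))
    print("chain:   ", count_internal(chain), "problems, depth", depth(chain))
```

For c = 8 classes, both trees contain c − 1 = 7 two-class problems, but the class-balanced tree has depth 3 while the chain has depth 7. In a deep, unbalanced tree, more of the two-class problems involve large subsets of the classes, which is one way to see why unbalanced trees can cost more runtime.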