Data with multi-valued categorical attributes can cause major problems for decision trees. The high branching factor can lead to data fragmentation, where decisions have little or no statistical support. In this paper, we propose a new ensemble method, Random Ordinality Ensembles (ROE), that circumvents this problem and provides significantly improved accuracy over other popular ensemble methods. We perform a random projection of the categorical data into a continuous space by imposing a random ordinality on the categorical attribute values. A decision tree learnt in this new continuous space can use binary splits, thereby avoiding the data fragmentation problem. A majority-vote ensemble is then constructed from several trees, each learnt from a different continuous space. An empirical evaluation on 13 datasets shows that this simple method significantly outperforms standard techniques such as Boosting and Random Forests. A theoretical study using an information-gain framework is carried out to explain the performance of Random Ordinality (RO). The study shows that ROE is quite robust to the data fragmentation problem and that RO trees are significantly smaller than trees generated using multi-way splits.
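The core idea of the abstract — imposing a random ordering on each categorical attribute so a tree can use binary splits, then majority-voting over trees built on different orderings — can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function names (`random_ordinality_encode`, `majority_vote`) and the use of NumPy are my own assumptions.

```python
import numpy as np

def random_ordinality_encode(X_cat, rng):
    # One Random Ordinality projection: for each categorical column,
    # assign its distinct values random integer ranks, mapping the
    # data into a continuous (ordinal) space where binary splits apply.
    X_num = np.empty(X_cat.shape, dtype=float)
    for j in range(X_cat.shape[1]):
        values = np.unique(X_cat[:, j])
        ranks = rng.permutation(len(values))
        lookup = dict(zip(values, ranks))
        X_num[:, j] = [lookup[v] for v in X_cat[:, j]]
    return X_num

def majority_vote(predictions):
    # predictions: (n_trees, n_samples) array of integer class labels.
    # Return the most frequent label per sample across the ensemble.
    votes = np.asarray(predictions)
    return np.array([np.bincount(col).argmax() for col in votes.T])
```

An ensemble would call `random_ordinality_encode` with a fresh random ordering for each member, fit a binary-split decision tree on each encoded copy, and combine the trees' predictions with `majority_vote`.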